OM protein - protein search, using sw model

Run on: March 22, 2004, 12:01:03 ; Search time 111 Seconds 

(without alignments) 
1476.374 Million cell updates/sec 



Title: US-10-069-541-6 
Perfect score: 2972 



Sequence: 1 MAFHVEGLIAIIVFYLLILL...EAFLDVDSSPEGSGTEDNLQ 580



Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 1586107 seqs, 282547505 residues 

Total number of hits satisfying chosen parameters: 1586107 



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
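
The throughput figure in the header can be cross-checked against the other header values. The following is a minimal sketch, assuming that one "cell update" means one query-residue by database-residue comparison; the report itself does not define the term, and the function name is hypothetical.

```python
# Values copied from the report header above.
QUERY_LENGTH = 580          # residues in the query ("Sequence: 1 ... 580")
DB_RESIDUES = 282_547_505   # "Searched: 1586107 seqs, 282547505 residues"
SEARCH_SECONDS = 111        # "Search time 111 Seconds"


def million_cell_updates_per_sec(query_len, db_residues, seconds):
    """Assumed definition: query length x database residues / seconds / 1e6."""
    return query_len * db_residues / seconds / 1e6


rate = million_cell_updates_per_sec(QUERY_LENGTH, DB_RESIDUES, SEARCH_SECONDS)
print(f"{rate:.3f} Million cell updates/sec")  # prints 1476.374 Million cell updates/sec
```

Under that assumption the computed rate agrees with the 1476.374 figure printed in the header.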
Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.

ALIGNMENTS



RESULT 1 
AAB74665 

ID AAB74665 standard; protein; 580 AA. 
XX 

AC AAB74665; 
XX 

DT 01-JUN-2001 (first entry) 
XX 

DE Human high affinity choline transporter protein. 
XX 

KW High affinity choline transporter; cho-1; Alzheimer's disease; diagnosis. 
XX 

OS Homo sapiens. 
XX 

PN WO200116315-A1. 
XX 



PD 08-MAR-2001. 
XX 

PF 18-AUG-2000; 2000WO-JP005545.
XX

PR 27-AUG-1999; 99JP-00240642.

PR 27-DEC-1999; 99JP-00368991.
XX 

PA (NISC-) JAPAN SCI & TECHNOLOGY CORP. 
XX 

PI Haga T, Okuda T; 
XX 

DR WPI; 2001-226688/23. 

DR N-PSDB; AAF81712. 
XX 

PT New rat and human spinal cord high affinity choline transporters, useful 

PT in diagnosis of Alzheimer's disease and screening promoters as drugs for 

PT treating Alzheimer's disease. 
XX 

PS Claim 8; Page 76-78; 90pp; Japanese. 
XX 

CC The present sequence represents a human (Homo sapiens) high affinity 

CC choline transporter protein designated cho-1. The cho-1 protein has 

CC nootropic and neuroprotective activities. The cho-1 polynucleotide and 

CC protein can be used for the diagnosis of diseases related to the 

CC expression of cho-1 by comparing the cho-1 polynucleotide sequence in a 

CC sample to that of a control. Drug compositions containing the cho-1 

CC protein or expression promoters or inhibitors of cho-1 are useful for 

CC treating disorders characterised by abnormal levels of cho-1, such as 

CC Alzheimer's disease 
XX 

SQ Sequence 580 AA; 

Query Match 100.0%; Score 2972; DB 4; Length 580; 

Best Local Similarity 100.0%; Pred. No. 3.6e-290; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy         61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy        181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy        241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy        301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy        361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy        421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy        481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy        541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
              ||||||||||||||||||||||||||||||||||||||||
Db        541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
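
The per-hit summary lines (Query Match, Score, Matches and so on) can be read mechanically. The sketch below parses the RESULT 1 summary and recomputes two percentages. It assumes that "Query Match" is the hit score as a percentage of the perfect score and that "Best Local Similarity" is the number of identical positions divided by the number of aligned columns; the report does not state either definition, and `parse_summary` is a hypothetical helper.

```python
import re

PERFECT_SCORE = 2972  # "Perfect score" from the report header

# Summary lines of RESULT 1, joined into one string.
summary = (
    "Query Match 100.0%; Score 2972; DB 4; Length 580; "
    "Best Local Similarity 100.0%; Pred. No. 3.6e-290; "
    "Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0;"
)


def parse_summary(text):
    """Split 'Name value;' fields into a dict of floats (percent signs dropped)."""
    fields = {}
    for part in text.split(";"):
        m = re.match(r"\s*(.+?)\s+([-+0-9.eE]+)%?\s*$", part)
        if m:
            fields[m.group(1)] = float(m.group(2))
    return fields


hit = parse_summary(summary)

# Assumed definition: score as a percentage of the perfect score.
query_match = 100.0 * hit["Score"] / PERFECT_SCORE

# Assumed definition: identical positions over all aligned columns.
aligned_columns = (
    hit["Matches"] + hit["Conservative"] + hit["Mismatches"] + hit["Indels"]
)
identity = 100.0 * hit["Matches"] / aligned_columns

print(f"Query Match {query_match:.1f}%, identity {identity:.1f}%")
```

For this hit both recomputed values are 100.0%, in line with the printed summary.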



RESULT 2
AAB86837

ID AAB86837 standard; protein; 580 AA.
XX

AC AAB86837;
XX

DT 26-NOV-2001 (first entry)
XX

DE Human CHOT protein.
XX

KW CHOT; human; choline transporter; chromosome 2q11-13; nootropic;

KW neuroprotective; gene therapy; antisense therapy; degenerative disease;

KW cognitive disorder; Alzheimer's disease.
XX

OS Homo sapiens.
XX

PN DE10009055-A1.
XX

PD 30-AUG-2001.
XX

PF 28-FEB-2000; 2000DE-01009055.
XX

PR 28-FEB-2000; 2000DE-01009055.
XX

PA (BRUE/) BRUESS M.

PA (BOEN/) BOENISCH H.
XX

PI Bruess M, Boenisch H;
XX

DR WPI; 2001-590709/67.

DR N-PSDB; AAH49207.
XX

PT A new gene encoding human choline transporter, designated hCHOT is

PT located on chromosome 2q11-13 and is useful to treat degenerative

PT disorders such as Alzheimer's disease.
XX

PS Disclosure; Page 11; 12pp; German.
XX

CC This invention describes a novel gene encoding human choline transporter,

CC designated hCHOT which is located on chromosome 2q11-13. The products of

CC the invention have nootropic and neuroprotective activity and can be used

CC for gene or antisense therapy. (I) is used to treat degenerative disease,

CC particularly cognitive disorders such as Alzheimer's disease. Sense and

CC antisense oligonucleotides derived from the gene may be used in

CC diagnostics and other techniques. This sequence represents the human CHOT

CC protein described in the invention
XX

SQ Sequence 580 AA;

Query Match 100.0%; Score 2972; DB 4; Length 580; 

Best Local Similarity 100.0%; Pred. No. 3.6e-290; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy          1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy         61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy        181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy        241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy        301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy        361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy        421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy        481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy        541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
              ||||||||||||||||||||||||||||||||||||||||
Db        541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580








RESULT 3 
ABU08979 

ID ABU08979 standard; protein; 580 AA. 
XX 

AC ABU08979; 
XX 

DT 13-JUN-2003 (first entry) 
XX 

DE Human high affinity choline transporter, HACT. 
XX 

KW Human; HACT; high affinity choline transporter; pain; 

KW neurotransmitter biosynthesis; learning and memory; aging; epilepsy; 

KW neurological disorder; spasticity; myoclonus; muscle spasm; 

KW muscle hyperactivity; stroke; head trauma; neuronal cell death; 

KW multiple sclerosis; spinal chord injury; dystonia; Alzheimer's disease; 

KW Myasthenia Gravis; multi-infarct dementia; AIDS dementia;

KW Parkinson's disease; Huntington's disease; amyotrophic lateral sclerosis; 

KW ALS; attention deficit disorder; organic brain syndrome; schizophrenia; 

KW nicotine addiction; memory disorder; cognitive disorder. 
XX 

OS Homo sapiens. 
XX 

PN US6500643-B1. 
XX 

PD 31-DEC-2002. 
XX 

PF 07-SEP-2000; 2000US-00657252.
XX

PR 07-SEP-2000; 2000US-00657252.
XX 

PA (UYFL-) UNIV FLORIDA.
XX 

PI Wu D, Gu Y, Millard WJ, He Y; 
XX 

DR WPI; 2003-361535/34. 

DR N-PSDB; ABX94338. 
XX 

PT Novel isolated polynucleotide (I) that encodes high affinity choline 

PT transporter protein, useful for preventing, treating or ameliorating 

PT neurological and cognitive disorders such as Alzheimer's or Parkinson's 

PT disease. 
XX 

PS Claim 1; Col 21-24; 20pp; English. 
XX 

CC The invention relates to an isolated polynucleotide which encodes a high 

CC affinity choline transporter (HACT) protein appearing as ABU08979. Also 

CC included are a polynucleotide encoding a fragment consisting of at least 

CC about 50 amino acids of the HACT protein, a vector comprising the 

CC polynucleotide, a composition comprising a vector comprising a 

CC polynucleotide which comprises at least about 12 contiguous nucleic acids 

CC of a polynucleotide appearing as ABX94339 (encoding choline 

CC acetyltransferase), a recombinant host cell which comprises the vector

CC (used to express the HACT protein or fragment). The polynucleotide is

CC useful as a probe or primer to detect the presence of HACT polynucleotide 

CC in a sample, such as a biological sample, or for screening for test 

CC agents which bind to the polynucleotide. A pharmaceutical composition 

CC comprising the polynucleotide is useful for preventing, treating or 



CC ameliorating neurological and cognitive disorders e.g. pain, spasticity, 

CC myoclonus, muscle spasm, muscle hyperactivity, epilepsy, stroke, head 

CC trauma, neuronal cell death, multiple sclerosis, spinal chord injury, 

CC dystonia, Alzheimer's disease, myasthenia gravis, multi-infarct

CC dementia, AIDS dementia, Parkinson's disease, Huntington's disease,

CC amyotrophic lateral sclerosis (ALS), attention deficit disorder, nicotine

CC addiction, organic brain syndromes, schizophrenia or memory and cognitive

CC disorders. HACT is thought to be the rate limiting step in cholinergic

CC neurotransmitter biosynthesis and regeneration (cholinergic transmissions

CC are crucial to brain functions such as learning and memory). The present

CC sequence represents human HACT 
XX 

SQ Sequence 580 AA; 

Query Match 100.0%; Score 2972; DB 6; Length 580; 

Best Local Similarity 100.0%; Pred. No. 3.6e-290; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy         61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy        181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy        241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy        301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy        361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy        421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy        481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy        541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
              ||||||||||||||||||||||||||||||||||||||||
Db        541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580



RESULT 4 
ADD50649 

ID ADD50649 standard; protein; 580 AA. 
XX 

AC ADD50649; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE High-affinity choline transporter (CHT) associated protein sequence #3. 
XX 

KW High-affinity choline transporter; CHT; cholinergic function; 

KW Parkinson's disease; Huntington's disease; Alzheimer's disease;

KW schizophrenia; dysautonomia; myasthenia gravis; brain;

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic; 

KW neuroprotective; neuroleptic. 

XX 

OS Unidentified. 
XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077.
XX

PR 23-JUL-2001; 2001US-00911077.
XX

PA (BLAK/) BLAKELY R D.

PA (APPA/) APPARSUNDARAM S.

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Disclosure; SEQ ID NO 12; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide 

CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene 

CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence

CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT

CC polynucleotide sequence when delivered to a cell, increases cholinergic 

CC function in the cell that is in a patient having Parkinson's disease, 

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or 

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 

CC signalling. The present protein sequence of unknown function is provided 

CC in the electronic sequence data but is not mentioned in the printed 



CC specification. Note: The sequence data for this patent was obtained in 

CC electronic format directly from the USPTO web site at seqdata.uspto.gov. 
XX 

SQ Sequence 580 AA; 

Query Match 100.0%; Score 2972; DB 7; Length 580; 

Best Local Similarity 100.0%; Pred. No. 3.6e-290; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy         61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy        181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy        241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy        301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy        361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy        421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy        481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy        541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
              ||||||||||||||||||||||||||||||||||||||||
Db        541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580



RESULT 5 
ADD50639 

ID ADD50639 standard; protein; 580 AA. 
XX 

AC ADD50639; 
XX 

DT 15-JAN-2004 (first entry) 



XX 

DE Human high-affinity choline transporter (hCHT).
XX 

KW Human; high-affinity choline transporter; hCHT; cholinergic function; 

KW Parkinson's disease; Huntington's disease; Alzheimer's disease;

KW schizophrenia; dysautonomia; myasthenia gravis; brain;

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic; 

KW neuroprotective; neuroleptic. 

XX 

OS Homo sapiens.
XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077.
XX

PR 23-JUL-2001; 2001US-00911077.
XX 

PA (BLAK/) BLAKELY R D. 

PA (APPA/) APPARSUNDARAM S. 

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 

DR N-PSDB; ADD50638. 

XX

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Claim 1; SEQ ID NO 2; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide 

CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene 

CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence

CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT

CC polynucleotide sequence when delivered to a cell, increases cholinergic 

CC function in the cell that is in a patient having Parkinson's disease, 

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or 

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 

CC signalling. The present sequence represents hCHT. Note: The sequence data

CC for this patent was obtained in electronic format directly from the USPTO 

CC web site at seqdata.uspto.gov. 

XX 

SQ Sequence 580 AA; 



Query Match 100.0%; Score 2972; DB 7; Length 580; 

Best Local Similarity 100.0%; Pred. No. 3.6e-290; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy         61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy        181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy        241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy        301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy        361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy        421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy        481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy        541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
              ||||||||||||||||||||||||||||||||||||||||
Db        541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580





RESULT 6 
ADD50648 

ID ADD50648 standard; protein; 580 AA. 
XX 

AC ADD50648; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE High-affinity choline transporter (CHT) associated protein sequence #2. 
XX 

KW High-affinity choline transporter; CHT; cholinergic function; 

KW Parkinson's disease; Huntington's disease; Alzheimer's disease;

KW schizophrenia; dysautonomia; myasthenia gravis; brain; 

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic; 

KW neuroprotective; neuroleptic. 

XX 



OS Unidentified. 
XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077.
XX

PR 23-JUL-2001; 2001US-00911077.
XX 

PA (BLAK/) BLAKELY R D. 

PA (APPA/) APPARSUNDARAM S. 

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Disclosure; SEQ ID NO 11; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide 

CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene 

CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence

CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT

CC polynucleotide sequence when delivered to a cell, increases cholinergic

CC function in the cell that is in a patient having Parkinson's disease,

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 

CC signalling. The present protein sequence of unknown function is provided 

CC in the electronic sequence data but is not mentioned in the printed 

CC specification. Note: The sequence data for this patent was obtained in 

CC electronic format directly from the USPTO web site at seqdata.uspto.gov. 

XX 

SQ Sequence 580 AA; 

Query Match 100.0%; Score 2972; DB 7; Length 580; 
Best Local Similarity 100.0%; Pred. No. 3.6e-290; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy         61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy        181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy        241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy        301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy        361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy        421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy        481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy        541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
              ||||||||||||||||||||||||||||||||||||||||
Db        541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580



RESULT 7 
ADD50647 

ID ADD50647 standard; protein; 580 AA. 
XX 

AC ADD50647; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE High-affinity choline transporter (CHT) associated protein sequence #1. 
XX 

KW High-affinity choline transporter; CHT; cholinergic function; 
KW Parkinson's disease; Huntington's disease; Alzheimer's disease; 
KW schizophrenia; dysautonomia; myasthenia gravis; brain; 
KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic; 
KW neuroprotective; neuroleptic. 
XX 

OS Unidentified. 
XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077. 
XX 

PR 23-JUL-2001; 2001US-00911077. 




XX 

PA (BLAK/) BLAKELY R D. 

PA (APPA/) APPARSUNDARAM S. 

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Disclosure; SEQ ID NO 10; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide 

CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene 

CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence 

CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT 

CC polynucleotide sequence when delivered to a cell, increases cholinergic 

CC function in the cell that is in a patient having Parkinson's disease, 

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or 

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 

CC signalling. The present protein sequence of unknown function is provided 

CC in the electronic sequence data but is not mentioned in the printed 

CC specification. Note: The sequence data for this patent was obtained in 

CC electronic format directly from the USPTO web site at seqdata.uspto.gov. 

XX 

SQ Sequence 580 AA; 

Query Match 100.0%; Score 2972; DB 7; Length 580; 

Best Local Similarity 100.0%; Pred. No. 3.6e-290; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 

Qy 61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120 
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120 

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180 
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180 

Qy 181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240 
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240 

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300 
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300 

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420 
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420 

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480 
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480 

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVKNENIKLD 540 
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVKNENIKLD 540 

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580 
||||||||||||||||||||||||||||||||||||||||
Db 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580 



RESULT 8 
AAB74664 

ID AAB74664 standard; protein; 580 AA. 
XX 

AC AAB74664; 
XX 

DT 01-JUN-2001 (first entry) 
XX 

DE Rat high affinity choline transporter protein. 
XX 

KW High affinity choline transporter; cho-1; Alzheimer's disease; diagnosis. 
XX 

OS Rattus norvegicus . 
XX 

PN WO200116315-A1. 
XX 

PD 08-MAR-2001. 
XX 

PF 18-AUG-2000; 2000WO-JP005545. 
XX 

PR 27-AUG-1999; 99JP-00240642. 
PR 27-DEC-1999; 99JP-00368991. 
XX 

PA (NISC-) JAPAN SCI & TECHNOLOGY CORP. 
XX 

PI Haga T, Okuda T; 
XX 

DR WPI; 2001-226688/23. 
DR N-PSDB; AAF81711. 
XX 

PT New rat and human spinal cord high affinity choline transporters, useful 
PT in diagnosis of Alzheimer's disease and screening promoters as drugs for 
PT treating Alzheimer's disease. 
XX 



PS Claim 5; Page 69-71; 90pp; Japanese. 
XX 

CC The present sequence represents a rat (Rattus norvegicus) high affinity 

CC choline transporter protein designated cho-1. The cho-1 protein has 

CC nootropic and neuroprotective activities. The cho-1 polynucleotide and 

CC protein can be used for the diagnosis of diseases related to the 

CC expression of cho-1 by comparing the cho-1 polynucleotide sequence in a 

CC sample to that of a control. Drug compositions containing the cho-1 

CC protein or expression promoters or inhibitors of cho-1 are useful for 

CC treating disorders characterised by abnormal levels of cho-1, such as 

CC Alzheimer's disease 

XX 

SQ Sequence 580 AA; 

Query Match 94.9%; Score 2820; DB 4; Length 580; 

Best Local Similarity 93.1%; Pred. No. 7.6e-275; 

Matches 540; Conservative 24; Mismatches 16; Indels 0; Gaps 0; 



Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 

I I I I I I I : I I I : I I I I I I I I I I I I I : I I I I I : I I I II II I I I I I I I I I I I I I I I I I I I 

Db 1 MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNAEERSEAIIVGGRDIGLLVGGFTMTA 60 

Qy 61 TWVGGGYI NGTAEAVYVPGYGLAWAQAPI GYS LS LI LGGLFFAKPMRS KGYVTMLDP FQQ 12 0 

I I I I I I I I I I I I I I I I M I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 

Db 61 TWVGGGYINGTAEAVYGPGCGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120 

Qy 121 I YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI IDVDMHI SVI I SALIATLYTLVGG 180 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I : I I I I I I I I I I I I 
Db 121 I YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI I DVDWI SVI VSALIAILYTLVGG 180 

Qy 181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240 

I I I I I I I I I I II I I I ! : I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 111 = 1 
Db 181 L YS VAYT DWQL FC I FI GLWI S VP FAL SH P AVT D I GFT AVHAKYQ S PWLGT I E S VEVYT W 240 

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300 

I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I 
Db 241 L DN FL LLMLGG I PWQAY FQ RVL S S S SAT YAQ VL S F LAAFGC LVMAL P AI C I GAI GAS T DW 300 

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 

MINI I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 NQTAYGFPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I 
Db 361 RNIYQLSFRQNASDKEIVV^RITVFVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420 

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 4 80 

I I I I I I : I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I 

Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYPDKNGIYNQRFPFK 480 

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVKNENIKLD 540 

I | : | I I II I I I I : I I I I I I I I I I I I I I I I I I : I I M I : I I I I I I I I I I I I I : I I I I I I : 
Db 481 TLSMVT S FFTN I CVS YLAKYLFES GTLP PKLDI FDAWS RH S EENMDKT I LVRNEN I KLN 540 



Qy 

Db 



541 
541 



RESULT 9 
ADD50643 

ID ADD50643 standard; protein; 580 AA. 
XX 

AC ADD50643; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE Rat high-affinity choline transporter (rCHT). 
XX 

KW Rat; high-affinity choline transporter; rCHT; cholinergic function; 

KW Parkinson's disease; Huntington's disease; Alzheimer's disease; 

KW schizophrenia; dysautonomia; myasthenia gravis; brain; 

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic; 

KW neuroprotective; neuroleptic. 

XX 

OS Rattus sp. 
XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077 . 
XX 

PR 23-JUL-2001; 2001US-00911077 . 
XX 

PA (BLAK/) BLAKELY R D. 

PA (APPA/) APPARSUNDARAM S. 

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 

DR N-PSDB; ADD50642. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Example 1; SEQ ID NO 6; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide 

CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene 

CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence 

CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT 

CC polynucleotide sequence when delivered to a cell, increases cholinergic 

CC function in the cell that is in a patient having Parkinson's disease, 

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or 

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 

CC signalling. The present sequence represents rat CHT (rCHT). Note: The 



CC sequence data for this patent was obtained in electronic format directly 

CC from the USPTO web site at seqdata.uspto.gov. 

XX 

SQ Sequence 580 AA; 

Query Match 94.9%; Score 2820; DB 7; Length 580; 

Best Local Similarity 93.1%; Pred. No. 7.6e-275; 

Matches 540; Conservative 24; Mismatches 16; Indels 0; Gaps 0; 

Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 

I I I I I I I : I I I : I I I I I I II I I I II : II I I I : I I I I I I I I I I I I I I I I I I I I M I I I I 

Db 1 MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNAEERSEAIIVGGRDIGLLVGGFTMTA 60 

Qy 61 TWVGGG Y I N GT AEAVYVP G YGLAWAQ AP I G Y S L S LI L GGL F FAK PMRS KG YVTML D P FQQ 12 0 

|| II I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I I M M I I M I I 

Db 61 TWVGGGYINGTAEAVYGPGCGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120 

Qy 121 I YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALI ATLYTLVGG 180 

I I I M I I I M I I I I I I I I I I I I I I I I M II I II I I I I I I I :: I I I I : I I I M II I II II 

Db 121 IYGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI I DVDVNI SVI VSALIAILYTLVGG 180 

Qy 181 L Y S VAYT D WQL FC I FVGLW I S VP FAL S H P AVAD I G FT AVHAK YQ K P W L GT VD S S EVY S W 24 0 

II MM II I I I : I I II I I I I I I II I M MINIMUM MM MM I MM 

Db 181 L Y S VAYT DWQ L F C I F I G LW I S VP FAL S H P AVT DIG FT AVHAK YQ SPWLGTIES VE VYT W 240 

Qy 241 LD S FLLLML GG I PWQAY FQRVL S S S SAT YAQVL S FLAAFGC LVMAI P AI L I GAI GAS T DW 300 

I I : I || I I I I I I I I I I I I I I I I I I I I I II M I I I : I II I M M M I II 

Db 241 LDN FL L LML G GI PWQAY FQ RVL S S S SAT YAQVL S FLAAFGCLVMAL P AI C I GAI GAS T DW 300 

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 

M M I I Mill II I I I II I I II II I I I I II I I I II II I I I I I I II II I M I II I I I M 

Db 301 NQTAYGFPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420 

I || | I || I I I I I I I I II I M II I I I I II M I I II I II I I I I I I I M II II II II M I M I 

Db 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420 

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 48 0 

I I I I || : I M I I II I I I I M II I II II I I M M I I I M I II I I II I I I I M I M II M 
Db 421 LLCVLFI KGTNTYGAVAGYI FGLFLRITGGEPYLYLQPLI FYPGYYPDKNGI YNQRFPFK 480 

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVKNENIKLD 540 

M M I I I I I II M II M I I I I I I I I I I I I M M II II M II I I I I II I I I I : M M I M 

Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDIFDAWSRHSEENMDKTILVRNENIKLN 540 

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580 

Ml I II II I : II I I II I II II I I I II II I II I I I II I I 

Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580 



RESULT 10 
AAY72388 

ID AAY72388 standard; protein; 580 AA. 
XX 

AC AAY72388; 
XX 

DT 24-APR-2001 (first entry) 



XX 

DE Mouse P4P6B1 OMA (obese mice adipocyte) protein. 
XX 

KW Mouse; OMA protein; obese mice adipocyte; P4P6B1; 

KW fuel metabolism disorder; therapy; obesity; diabetes; gene therapy; 

KW anorectic; antidiabetic. 

XX 

OS Mus sp. 
XX 

PN WO200078950-A2. 
XX 

PD 28-DEC-2000. 
XX 

PF 13-JUN-2000; 2000WO-US016217 . 
XX 

PR 22-JUN-1999; 99US-0141515P . 
XX 

PA (AMYL-) AMYLIN PHARM INC. 
XX 

PI Sierzega M, Albrandt K; 
XX 

DR WPI; 2001-112322/12. 

DR N-PSDB; AAD02457. 
XX 

PT Novel obese mice adipocyte polypeptides useful in diagnosis and treatment 

PT of disorders of fuel metabolism such as obesity or diabetes. 

XX 

PS Claim 11; Fig 3; 83pp; English. 
XX 

CC The present sequence is mouse OMA (obese mice adipocyte) protein encoded 

CC by P4P6B1 cDNA. The P4P6B1 cDNA fragment was generated by RNA 

CC fingerprinting using random primers P4 and P6. OMA is used as a 

CC diagnostic reagent for diagnosing a disorder of fuel metabolism in an 

CC underweight or an overweight individual, by detecting the transcription 

CC level of a gene encoding OMA, which is induced or repressed in an 

CC individual by a factor such as genetic obesity, fasting and refeeding of 

CC a fasted individual. OMA is useful in the generation of antibodies, for 

CC use in pharmaceutical compositions and for studying DNA/protein 

CC interactions. Nucleic acids encoding OMA are involved in gene therapy. An 

CC inhibitor of OMA or an antisense oligonucleotide that inhibits expression 

CC of OMA are useful for treating disorders of fuel metabolism such as 

CC obesity or diabetes 

XX 

SQ Sequence 580 AA; 

Query Match 94.5%; Score 2810; DB 4; Length 580; 

Best Local Similarity 93.1%; Pred. No. 7.8e-274; 

Matches 540; Conservative 23; Mismatches 17; Indels 0; Gaps 0; 

Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 

I : I I I I I I : I I I : I I I I I I I I I I I I I I : I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1 MSFHVEGLVAIILFYLLILLVGIWAAWKTKNSGNPEERSEAIIVGGRDIGLLVGGFTMTA 60 



Qy 61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120 
I I I I I I I I I I i I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I 
Db 61 TWVGGGYINGTAEAVYGPGCGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120 

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I M I I I I I I I I I 
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180 

Qy 181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240 
I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I :: I I I I : I 
Db 181 LYSVAYTDWQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240 

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300 
I | : | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I 
Db 241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300 

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 
I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420 
I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I 
Db 361 RNIYQLSFRQNASDKEIVWVMRITVLVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420 

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480 
I I I I I I : I I I I I I I I I I I I : II I I I I I I I I I I I I I I I I I I I I I I I I 111111:1111 
Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFK 480 

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540 
I | : | II I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ! I I I : I I I I I I : 
Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVRNENIKLN 540 

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580 
III II I I I I : I I I I I I I I II I I I I I I I I I I I I I I I I I I 
Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580 



RESULT 11 
AAB74666 

ID AAB74666 standard; protein; 580 AA. 
XX 

AC AAB74666; 
XX 

DT 01-JUN-2001 (first entry) 
XX 

DE Mouse high affinity choline transporter protein. 
XX 

KW High affinity choline transporter; cho-1; Alzheimer's disease; diagnosis. 
XX 

OS Mus musculus. 
XX 

PN WO200116315-A1. 
XX 

PD 08-MAR-2001. 
XX 

PF 18-AUG-2000; 2000WO-JP005545. 
XX 

PR 27-AUG-1999; 99JP-00240642. 
PR 27-DEC-1999; 99JP-00368991. 
XX 

PA (NISC-) JAPAN SCI & TECHNOLOGY CORP. 



XX 

PI Haga T, Okuda T; 
XX 

DR WPI; 2001-226688/23. 

DR N-PSDB; AAF81713. 
XX 

PT New rat and human spinal cord high affinity choline transporters, useful 

PT in diagnosis of Alzheimer's disease and screening promoters as drugs for 

PT treating Alzheimer's disease. 
XX 

PS Claim 11; Page 82-85; 90pp; Japanese. 
XX 

CC The present sequence represents a mouse (Mus musculus) high affinity 

CC choline transporter protein designated cho-1. The cho-1 protein has 

CC nootropic and neuroprotective activities. The cho-1 polynucleotide and 

CC protein can be used for the diagnosis of diseases related to the 

CC expression of cho-1 by comparing the cho-1 polynucleotide sequence in a 

CC sample to that of a control. Drug compositions containing the cho-1 

CC protein or expression promoters or inhibitors of cho-1 are useful for 

CC treating disorders characterised by abnormal levels of cho-1, such as 

CC Alzheimer's disease 
XX 

SQ Sequence 580 AA; 

Query Match 94.2%; Score 2801; DB 4; Length 580; 

Best Local Similarity 92.8%; Pred. No. 6.3e-273; 

Matches 538; Conservative 23; Mismatches 19; Indels 0; Gaps 0; 

Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 
| : | M I || : I I I : I I I I I I I I I I I I I : I I I I I : II I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MSFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEEHSEAIIVGGRDIGLLVGGFTMTA 60 

Qy 61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120 
M I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 TWVGGGYINGTAEAVYGPGCGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120 

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180 
I | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M : : I I I I : I I I M I I I I I I I 
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180 

Qy 181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240 
I | || I I I I I I I I I II I : I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I :: I I I I : I 
Db 181 LYSVAYTDWQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240 

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300 
I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I 
Db 241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300 

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 
| I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 NQTAYGYPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420 
| | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I II II : I I I I 
Db 361 RNIYQLSFRQNASDKEIVWVMRITVLVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420 

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480 
I I I I I I * I I I I I I I i I I I I - I I I I I I I I I I I ' 
Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFK 480 

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVKNENIKLD 540 
||:||||| | | | | : I I I I I I I I I I I I II M I I ! I i I I I I I I I I I I i I I I I I I - I I I M I : 
Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVRNENIKLN 540 

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580 
I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580 



RESULT 12 
ADD50641 

ID ADD50641 standard; protein; 580 AA. 
XX 

AC ADD50641; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE Mouse high-affinity choline transporter (mCHT) #1. 
XX 

KW Mouse; high-affinity choline transporter; mCHT; cholinergic function; 

KW Parkinson's disease; Huntington's disease; Alzheimer's disease; 

KW schizophrenia; dysautonomia; myasthenia gravis; brain; 

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic; 

KW neuroprotective; neuroleptic. 

XX 

OS Mus sp. 
XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077 . 
XX 

PR 23-JUL-2001; 2001US-00911077 . 
XX 

PA (BLAK/) BLAKELY R D. 

PA (APPA/) APPARSUNDARAM S. 

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 

DR N-PSDB; ADD50640. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Claim 29; SEQ ID NO 4; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide 

CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene 



CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence 

CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT 

CC polynucleotide sequence when delivered to a cell, increases cholinergic 

CC function in the cell that is in a patient having Parkinson's disease, 

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or 

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 

CC signalling. The present sequence represents mCHT. Note: The sequence data 

CC for this patent was obtained in electronic format directly from the USPTO 

CC web site at seqdata.uspto.gov. 

XX 

SQ Sequence 580 AA; 

Query Match 94.0%; Score 2795; DB 7; Length 580; 

Best Local Similarity 92.6%; Pred. No. 2.5e-272; 

Matches 537; Conservative 23; Mismatches 20; Indels 0; Gaps 0; 

Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 
I I I I I 11:1 I 1:1 II M I MM I I 1:11 I M: I I I I M I I M I M I M I M I 
Db 1 MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEERSEAIIVGGRDIGLLVGGFTMTA 60 

Qy 61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120 
M M M M M M M I I I I M M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I 
Db 61 TWVGGGYINGTAEAVYGPGCGLAWAHAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFKQ 120 

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180 
M I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I M I I I I I I I I : I I I I I I I I I I I I 
Db 121 

Qy 181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240 
I I I I I I I I I I I I M I I : I I I II I M I I I I I I I I II I II I II II I I I I I I :: I I I I : I 
Db 181 

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300 
I | : | | | I I II I I I II I I I I I I I I I I II I I I I M II:IM I I I I I I I I I I 
Db 241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300 

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 
|| M | | || | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I 
Db 301 NQTAYGYPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420 
I | | | I M II I II I I I I II M I I M I I I I I I I I I M I I I I I I I I I I I I I : I I I I 
Db 361 RNIYQLSFRQNASDKEIVWVMRITVLVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420 

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480 
M | | | | : | | I I || M I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I IIIIMUIII 
Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFK 480 

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVKNENIKLD 540 
| | : M || | | | | | : I I II I I M I I I I I I I I II I I I I I II I I I I I I I I I I I I I : M II I I : 
Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVRNENIKLN 540 

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580 
III |||IM:|||IIIIIIII I I I I I I I I I I I I I I I I I 
Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580 



RESULT 13 
ADD50661 

ID ADD50661 standard; protein; 580 AA. 
XX 

AC ADD50661; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE Mouse high-affinity choline transporter (mCHT) #2. 
XX 

KW Mouse; high-affinity choline transporter; mCHT; cholinergic function; 

KW Parkinson's disease; Huntington's disease; Alzheimer's disease; 

KW schizophrenia; dysautonomia; myasthenia gravis; brain; 

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic; 

KW neuroprotective; neuroleptic. 

XX 

OS Mus sp. 
XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077 . 
XX 

PR 23-JUL-2001; 2001US-00911077 . 
XX 

PA (BLAK/) BLAKELY R D. 

PA (APPA/) APPARSUNDARAM S. 

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 

DR N-PSDB; ADD50660. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Disclosure; SEQ ID NO 24; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide 

CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene 

CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence 

CC encoding hCHT, is useful for expressing hCHT recombinantly. The hCHT 

CC polynucleotide sequence when delivered to a cell, increases cholinergic 

CC function in the cell that is in a patient having Parkinson's disease, 

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or 

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 



CC signalling. The present sequence represents mCHT. Note: The sequence data 

CC for this patent was obtained in electronic format directly from the USPTO 

CC web site at seqdata.uspto.gov. 
XX 

SQ Sequence 580 AA; 

Query Match 94.0%; Score 2795; DB 7; Length 580; 

Best Local Similarity 92.6%; Pred. No. 2.5e-272; 

Matches 537; Conservative 23; Mismatches 20; Indels 0; Gaps 0 

Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 
I I I I I I I : I I I : II II I I I I I I I I I : I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEERSEAIIVGGRDIGLLVGGFTMTA 60 

Qy 61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120 
I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I M I I I I I I I I I I II I I I I I I I I I I : I 
Db 61 TWVGGGYINGTAEAVYGPGCGLAWAHAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFKQ 120 

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180 
I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I M I I I I I I :: I I I I : I I I I I I II II I I 
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180 

Qy 181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240 
I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : : I I I I : I 
Db 181 LYSVAYTDWQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240 



Qy 241 LD S FLL LML GG I PWQAY FQ RVL S S S SAT YAQVL S FLAAFG C LVMAI P AI L I GAI GAS T DW 300 

M : I I I I I I I I I II I I I I I : I I I I I I I I 

Db 241 LDN FL L LML GG I PWQAY FQ RVL S S S SAT YAQVL S FLAAFGC LVMAL P AI C I GAI GAS T DW 300 

Qy 301 NQTAYGLPDPKTTEEADMI LPIVLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSAS SMFA 360 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 NQTAYGYP DPKTKEEADMI LP I VLQYLC PVYI S FFGLGAVSAAVMS SADS S I LSAS SMFA 360 

Qy 361 RN I YQLS FRQNAS DKEI VWVMRI TVFVFGASATAMALLTKT VYGLWYLS S DLVYI VI FPQ 420 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I 
Db 361 RNIYQLSFRQNASDKEIVWVMRITVLVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420 

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480 

I I I I || : I I I I I I I I II I I : I II I I I I I I I I I I I I I I I I I I I I I I I 111111:1111 
Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFK 480 

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVKNENIKLD 540 

Ihlllll I I I I : I I I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I : I I I I I I : 

Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVRNENIKLN 540 

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580 

III I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580 



RESULT 14 
AAB74663 

ID AAB74663 standard; protein; 576 AA. 
XX 

AC AAB74663; 
XX 



DT 01-JUN-2001 (first entry) 
XX 

DE C. elegans high affinity choline transporter protein. 
XX 

KW High affinity choline transporter; cho-1; Alzheimer's disease; diagnosis. 
XX 

OS Caenorhabditis elegans. 
XX 

PN WO200116315-A1. 
XX 

PD 08-MAR-2001. 
XX 

PF 18-AUG-2000; 2000WO- JP005545 . 
XX 

PR 27-AUG-1999; 99 JP-00240642 . 

PR 27-DEC-1999; 99 JP-00368991 . 
XX 

PA (NISC-) JAPAN SCI & TECHNOLOGY CORP. 
XX 

PI Haga T, Okuda T; 
XX 

DR WPI; 2001-226688/23. 

DR N-PSDB; AAF81710. 
XX 

PT New rat and human spinal cord high affinity choline transporters, useful 

PT in diagnosis of Alzheimer's disease and screening promoters as drugs for 

PT treating Alzheimer's disease. 
XX 

PS Claim 2; Page 62-64; 90pp; Japanese. 

CC The present sequence represents a Caenorhabditis elegans high affinity 

CC choline transporter protein designated cho-1. The cho-1 protein has 

CC nootropic and neuroprotective activities. The cho-1 polynucleotide and 

CC protein can be used for the diagnosis of diseases related to the 

CC expression of cho-1 by comparing the cho-1 polynucleotide sequence in a 

CC sample to that of a control. Drug compositions containing the cho-1 

CC protein or expression promoters or inhibitors of cho-1 are useful for 

CC treating disorders characterised by abnormal levels of cho-1, such as 

CC Alzheimer's disease 
XX 

SQ Sequence 576 AA; 

Query Match 48.9%; Score 1453; DB 4; Length 576; 
Best Local Similarity 50.5%; Pred. No. 4.7e-137; 

Matches 295; Conservative 95; Mismatches 150; Indels 44; Gaps 9; 
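
The percentages reported for each hit appear to follow directly from the counts printed beside them. The sketch below is an inference from the figures in this report, not a documented GenCore definition: it assumes that Query Match is the hit score divided by the perfect score, and that Best Local Similarity is the number of identical matches divided by the total number of alignment columns (matches, conservative substitutions, mismatches, and indels). The function names are hypothetical.

```python
def query_match(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the query's perfect (self) score."""
    return 100.0 * score / perfect_score


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all alignment columns."""
    aligned_columns = matches + conservative + mismatches + indels
    return 100.0 * matches / aligned_columns


# Figures taken from the C. elegans hit above (Score 1453, Perfect score 2972).
print(round(query_match(1453, 2972), 1))
print(round(best_local_similarity(295, 95, 150, 44), 1))
```

Under these assumptions the two calls reproduce the 48.9% and 50.5% printed for this hit, and the same formulas give 94.0% and 92.6% for the mouse CHT hit (Score 2795; 537 matches over 580 columns).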

Qy 7 GLIAIIVFYLLILLVGIWAAWRTKNSGSAEER S EAI I VGGRDI GLLVGGFTMTATW 62 

| : : I I : I I : I I I : I I I I I : : I : I I : | : : : I I : I I I I I I I I I I I I 

Db 6 G I VAI VF FYVL I LWG I WAGRK S KSSKELES EAGAAT E EVMLAG RN I GT LVG I FTMT ATW 65 

Qy 63 VGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQIY 122 

M | | M | | | | | : | II I I I : I I :: I I :: I I I I I I I I : I I : I I I I I I I I 

Db 66 VGGAYINGTAEALY--NGGLLGCQAPVGYAISLVMGGLLFAKKMREEGYITMLDPFQHKY 123 



Qy 123 GKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGGLY 182 

| : | : | | | : : : I I I : I I II III I I I I I I : I I I : : I I : I I : I I M II II I 

Db 124 GQRIGGLMYVPALLGETFWTAAILSALGATLSVTLGIDMNASVTLSACIAVFYTFTGGYY 183 

Qy 183 SVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDS-SEVYSWL 241 

: I I I I I I I I I I I I I I I I I : I I I : I M I I : I : I ' : 

Db 184 AVAYTDWQLFCIFVGLWVCVPAAMVHDGAKDISRNA GDWIGEIGGFKETSLWI 237 

Qy 242 DS FLLLMLGGI PWQAYFQRVLS S SSAT YAQVLS FLAAFGCLVMAI PAI LI GAI GASTDWN 301 

I III: MM I I II II II II :l II II Ml I I:: II II Mill MM 
Db 238 DCMLLLVFGGI PWQVYFQRVLS SKTAHGAQTLS FVAGVGCI LMAI PPALI GAIARNTDWR 297 

Qy 302 QTAYGLPDPKTTEEA DMI LP I VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSA 355 

| | : I I : : | : : | : | III I : : : I M I I I I I II II II I I I M M 

Db 298 MTDYSPWNNGTKVESIPPDKRNMWPLVFQYLTPRWVAFIGLGAVSAAVMSSADSSVLSA 357 

Qy 356 S SMFARN I YQL S FRQNAS DKE I VWVMRI T VFVFGAS ATAMAL LT KTVYGLW YL S S DLVYI 415 

M I I I I I :M : I M I M I :: I I I I : I II III : : M I I I I I M I I I : 
Db 358 ASMFAHNI WKLT I RPHAS EKEVI I VMRI AI I CVGIMAT IMALT I QS I YGLWYLCADLVYV 417 

Qy 416 VIFPQLLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQ 475 

: : | | | | | I I : : : : I I I I : : I I I II I I : I I I I : I I I I M : I 

Db 418 ILFPQLLCWYMPRSNTYGSLAGYAVGLVLRLIGGEPLVSLPAFFHYPMY TDGV--Q 472 

Qy 476 KFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAW ARHSEENMDKTILV 532 

I II M I I : M I : I : : I I M I I I : I I II I I : I 

Db 473 YFPFRTTAMLSSMATIYIVSIQSEKLFKSGRLSPEWDVMGCWNIPIDHVPLPSDVSFAV 532 

Qy 533 KNENIKL DELALVKPRQSMTLSSTFTN 559 

:\ : : II h I : I I M 

Db 533 SSETLNMKAPNGTPAPVHPNQQPSDENTLLHPYSDQSYYSTNSN 576 



RESULT 15 
ADD50645 

ID ADD50645 standard; protein; 576 AA. 
XX 

AC ADD50645; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE C. elegans CHO1 protein. 
XX 

KW High-affinity choline transporter; CHT; cholinergic function; 

KW Parkinson's disease; Huntington's disease; Alzheimer's disease; 

KW schizophrenia; dysautonomia; myasthenia gravis; brain; 

KW cholinergic signalling; antiparkinsonian; anticonvulsant; nootropic; 

KW neuroprotective; neuroleptic; CHO1. 

XX 

OS Caenorhabditis elegans. 
XX 

PN US2003114399-A1. 
XX 

PD 19-JUN-2003. 
XX 

PF 23-JUL-2001; 2001US-00911077 . 
XX 

PR 23-JUL-2001; 2001US-00911077 . 
XX 

PA (BLAK/) BLAKELY R D. 



PA (APPA/) APPARSUNDARAM S. 

PA (FERG/) FERGUSON S. 

XX 

PI Blakely RD, Apparsundaram S, Ferguson S; 
XX 

DR WPI; 2003-810914/76. 
XX 

PT Novel isolated polynucleotide encoding human or mouse high affinity 

PT choline transporter polypeptide, useful in gene therapy to increase 

PT cholinergic function in a cell of a patient suffering from Alzheimer's 

PT disease. 
XX 

PS Disclosure; SEQ ID NO 8; 74pp; English. 
XX 

CC The present invention relates to the isolation of polynucleotide 

CC sequences encoding human and mouse high-affinity choline transporter 

CC (hCHT and mCHT respectively), and the proteins they encode. The gene 

CC encoding hCHT is located on chromosome 2q12. The polynucleotide sequence 

CC encoding hCHT is useful for expressing hCHT recombinantly. The hCHT 

CC polynucleotide sequence when delivered to a cell, increases cholinergic 

CC function in the cell that is in a patient having Parkinson's disease, 

CC Huntington's disease, Alzheimer's disease, schizophrenia, dysautonomia or 

CC myasthenia gravis. The hCHT antibody is useful for controlling 

CC transporter CHT proteins to the brain, and for treating the above 

CC mentioned diseases. The antibody is also useful for diagnosing the above 

CC mentioned disorders and to detect the influence of cholinergic 

CC signalling. The present sequence represents Caenorhabditis elegans CHO1 

CC protein. Note: The sequence data for this patent was obtained in 

CC electronic format directly from the USPTO web site at seqdata.uspto.gov. 

XX 

SQ Sequence 576 AA; 

Query Match 48.9%; Score 1453; DB 7; Length 576; 

Best Local Similarity 50.5%; Pred. No. 4.7e-137; 

Matches 295; Conservative 95; Mismatches 150; Indels 44; Gaps 9 



Qy 7 GLIAIIVFYLLILLVGIWAAWRTKNSGSAEER SEAIIVGGRDIGLLVGGFTMTATW 62 

l . . I l . ii • i ii • i ii ii . . j . i i • i ■ • • i i • i i iii i i i i i i i 

Db 6 GIVAIVFFYVLILWGIWAGRKSKSSKELESEAGAATEEVMLAGRNIGTLVGIFTMTATW 65 

Qy 63 VGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQIY 122 

III I I I I I I I I • I ii i i i • i i • * i i • • I i i iii ii -ii- i i ' 

Db 66 VGGAYINGTAEALY--NGGLLGCQAPVGYAISLVMGGLLFAKKMREEGYITMLDPFQHKY 123 

Qy 123 GKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGGLY 182 

l • l • l l I • • * I I I ■ i i ii mi i -iii- -ii- ii • 

Db 124 GQRIGGLMYVPALLGETFWTAAILSALGATLSVILGIDMNASVTLSACIAVFYTFTGGYY 183 

Qy 183 SVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDS-SEVYSWL 241 

: I I I I I I I I I I I I I I I I I : I I I : I II I I : I : I I : 

Db 184 AVAYTDWQLFCIFVGLWVCVPAAMVHDGAKDISRNA GDWIGEIGGFKETSLWI 237 

Qy 242 DSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDWN 301 

I iii- i i i i i i i i i i i i i i • i i » i i i • i ii i -iii 

Db 238 DCMLLLVFGGIPWQVYFQRVLSSKTAHGAQTLSFVAGVGCILMAIPPALIGAIARNTDWR 297 

Qy 302 QTAYGLPDPKTTEEA DMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSA 355 

II : | |: :|::|:| III I :::| I I I I I I I I I I I I I I I I : I I I 

Db 298 MT D Y S PWNNGT KVE SIP P DKRNMWP LVFQ YLT P RWVAFI GLGAVS AAVMS SAD S S VL S A 357 

Qy 356 S SMFARN I YQLS FRQNAS DKEI VWVMRI TVFVFGASATAMALLT KTVYGLW YLS S DLVYI 415 

: I I I I I I :: I : I : I I : I I :: I I I I : I Mill : : : I I I I I I : I I I I : 
Db 358 ASMFAHNIWKLTIRPHASEKEVIIVMRIAIICVGIMATIMALTIQSIYGLWYLCADLVYV 417 

Qy 416 VIFPQLLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQ 475 

: : I I I I I II : : : : I I I I : : I I I I I I I : I I I I : I Ml : I : I 
Db 418 ILFPQLLCWYMPRSNTYGSLAGYAVGLVLRLIGGEPLVSLPAFFHYPMY TDGV--Q 472 

Qy 476 KFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAW ARHSEENMDKTILV 532 

I I |: I ||:: I I :| :: 11:11 I I: II II I I : I 

Db 473 YFPFRTTAMLSSMATIYIVSIQSEKLFKSGRLSPEWDVMGCWNIPIDHVPLPSDVSFAV 532 

Qy 533 KNENIKL DELALVKPRQSMTLSSTFTN 559 

: I : : I I I : I : I I : I 

Db 533 SSETLNMKAPNGTPAPVHPNQQPSDENTLLHPYSDQSYYSTNSN 576 



Search completed: March 22, 2004, 15:32:18 
Job time : 118 secs 



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 
Run on: March 22, 2004, 15:30:29 ; Search time 33 Seconds 

(without alignments) 
907.366 Million cell updates/sec 

Title: US-10-069-541-6 
Perfect score: 2972 

Sequence: 1 MAFHVEGLIAIIVFYLLILL...EAFLDVDSSPEGSGTEDNLQ 580 

Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 389414 seqs, 51625971 residues 

Total number of hits satisfying chosen parameters: 389414 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
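
The throughput figure in the header can be checked against the other header values. The sketch below assumes, as an inference from the numbers rather than from any GenCore documentation, that one "cell update" is one cell of the dynamic-programming matrix, so the total is query length times database residues. The function name is hypothetical.

```python
def cell_updates_per_second(query_length: int, db_residues: int,
                            search_seconds: float) -> float:
    """Dynamic-programming cells filled per second of search time."""
    return query_length * db_residues / search_seconds


# Header values: 580-residue query, 51625971 residues searched, 33 seconds.
rate = cell_updates_per_second(580, 51_625_971, 33)
print(f"{rate / 1e6:.3f} Million cell updates/sec")
```

With the values from this header the result rounds to the reported 907.366 million, which supports reading "Search time" as the time for the matrix fill alone ("without alignments").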

Database: Issued_Patents_AA:* 

1: /cgn2_6/ptodata/2/iaa/5A_COMB.pep:* 

2: /cgn2_6/ptodata/2/iaa/5B_COMB.pep:* 

3: /cgn2_6/ptodata/2/iaa/6A_COMB.pep:* 

4: /cgn2_6/ptodata/2/iaa/6B_COMB.pep:* 

5: /cgn2_6/ptodata/2/iaa/PCTUS_COMB.pep:* 

6: /cgn2_6/ptodata/2/iaa/backfiles1.pep:* 

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
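
For orientation, the "sw model" with the Gapop and Gapext parameters listed above corresponds to Smith-Waterman local alignment with affine gap penalties. The following is a minimal sketch of that recurrence, not the GenCore implementation. It makes two loud assumptions: a toy match/mismatch scorer (`toy_score`, hypothetical) stands in for BLOSUM62, and a gap of length k is charged `gap_open + (k - 1) * gap_extend`, which may differ from how GenCore applies its two parameters.

```python
NEG_INF = float("-inf")


def toy_score(x: str, y: str) -> float:
    """Stand-in scorer (assumption); the report itself uses BLOSUM62."""
    return 5.0 if x == y else -4.0


def smith_waterman_affine(a: str, b: str, score=toy_score,
                          gap_open: float = 10.0,
                          gap_extend: float = 0.5) -> float:
    """Return the best local alignment score of a and b (Gotoh recurrence)."""
    n, m = len(a), len(b)
    # H: best score ending at (i, j); E/F: best score ending in a gap.
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    E = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    F = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Open a new gap from H or extend an existing one.
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            diagonal = H[i - 1][j - 1] + score(a[i - 1], b[j - 1])
            # The floor at zero is what makes the alignment local.
            H[i][j] = max(0.0, diagonal, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


print(smith_waterman_affine("ACGTACGT", "ACGTTACGT"))
```

Under this scheme, a query aligned against itself scores the sum of its diagonal substitution scores, which is presumably what the report calls the perfect score.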
ALIGNMENTS 



RESULT 1 
US-09-657-252-2 

; Sequence 2, Application US/09657252 
; Patent No. 6500643 
; GENERAL INFORMATION: 
; APPLICANT: Wu, Dong-Hai 
; APPLICANT: Gu, Yunrong 
; APPLICANT: Millard, William 
; APPLICANT: He, Yun-Je 
; TITLE OF INVENTION: Human High Affinity Choline Transporter cDNA 
; FILE REFERENCE: MBHB00-639 
; CURRENT APPLICATION NUMBER: US/09/657,252 
; CURRENT FILING DATE: 2000-09-07 
; NUMBER OF SEQ ID NOS: 6 
; SOFTWARE: PatentIn version 3.0 
; SEQ ID NO 2 
; LENGTH: 580 
; TYPE: PRT 
; ORGANISM: Homo sapiens 
US-09-657-252-2 



Query Match 100.0%; Score 2972; DB 4; Length 580; 

Best Local Similarity 100.0%; Pred. No. 1.6e-281; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 

Qy  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120 
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120 

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180 
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180 

Qy 181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240 

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300 
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300 

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420 
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420 

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480 
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480 

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVKNENIKLD 540 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVKNENIKLD 540 

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580 
       ||||||||||||||||||||||||||||||||||||||||
Db 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580 



RESULT 2 
US-07-841-651-4 

; Sequence 4, Application US/07841651 
; Patent No. 5410031 

GENERAL INFORMATION: 

APPLICANT: Pajor, Ana M 
; APPLICANT: Wright, Ernest M 

TITLE OF INVENTION: Cloning and Functional Expression of a 

TITLE OF INVENTION: Mammalian Na+/Nucleoside Cotransporter: A Member of the 

TITLE OF INVENTION: SGLT Family 
NUMBER OF SEQUENCES: 4 
CORRESPONDENCE ADDRESS : 

ADDRESSEE: Sheldon & Mak 

STREET: 225 South Lake Avenue, Ninth Floor 
CITY: Pasadena 
STATE: California 
COUNTRY : USA 
ZIP: 91101 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/841,651 
FILING DATE: 19920224 
CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 
NAME: Mandel, SaraLynn 
REGISTRATION NUMBER: 31,853 
REFERENCE/ DOCKET NUMBER: 8772 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (818) 796-4000 
TELEFAX: (818) 795-6321 
INFORMATION FOR SEQ ID NO: 4: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 662 amino acids 
TYPE: AMINO ACID 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
HYPOTHETICAL: NO 
ORIGINAL SOURCE: 

ORGANISM: Oryctolagus cuniculus 
US-07-841-651-4 

Query Match 10.4%; Score 308.5; DB 1; Length 662; 

Best Local Similarity 23.4%; Pred. No. 3.2e-21; 

Matches 154; Conservative 110; Mismatches 238; Indels 155; Gaps 26; 
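
Each hit carries the same three summary lines of `name value;` pairs, so they can be read mechanically. The sketch below is one possible way to do that and is not part of GenCore: `parse_hit_summary` is a hypothetical helper that splits on semicolons and treats the last whitespace-separated token of each field as the value. It assumes the numbers are intact; OCR-damaged values elsewhere in this transcript (for example a Pred. No. printed with letters in place of digits) would raise `ValueError`.

```python
def parse_hit_summary(lines):
    """Parse 'Name value; Name value;' summary lines into a dict of floats."""
    fields = {}
    for line in lines:
        for part in line.split(";"):
            part = part.strip()
            if not part:
                continue
            # The value is the last token; everything before it is the name.
            name, _, value = part.rpartition(" ")
            fields[name.strip()] = float(value.rstrip("%"))
    return fields


summary = parse_hit_summary([
    "Query Match 10.4%; Score 308.5; DB 1; Length 662;",
    "Best Local Similarity 23.4%; Pred. No. 3.2e-21;",
    "Matches 154; Conservative 110; Mismatches 238; Indels 155; Gaps 26;",
])
print(summary["Score"], summary["Pred. No."], summary["Gaps"])
```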

VFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70 

: : : | : : : I I : I I : I I I : : I I : | : : I : : I I : I 

IYFLWMAVGLWAMFST-NRGTV GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 8 6 

EAVYVP G YGLAWAQAP I G YS LSLILGGLFFAKPMRSKGYVTMLDPFQQI Y-GK 12 4 

Ml | | : : :: II : I : I : I I I I : I : : I I 



Qy 


11 


Db 


32 


Qy 


71 


Db 


87 


Qy 


125 


Db 


140 


Qy 


182 


Db 


198 



[GGLLFI PALMGEMFW — AAAI FSALGAT-I SVI I DVDMHI SVI I SALI ATLYTLVGGL 181 

| | : | : : I : I I I I II | : : : I : : : : : I I : I III: III 
QIYLSILSLLLYIFTKISADIFS — GAIFIQLTLGLDIYVAIIILLVITGLYTITGGL 197 

1VAYTDWQLFCI FVGLWI S VP FAL S H P AVAD I G FT AVHAK Y Q 225 

I I I I : I : I I I I I I : I II I 

vVT YTDTLOTAIMMVGSVI LTGFAFHEVG GYEAFT EKYMRAI P SQI S YGNT S I PQ 253 



Qy 



226 KPWLGTVDSSEVYSWLDSFLLLMLGGIPW QAYFQRVLS S S S A 267 

I : I : : I : I I I I I I I I I : : 

Db 254 KCYTPREDAFHI FRDAI T GD I PW P GLVFGM SILT LW YWCT DQVI VQ RC L S AKN L 307 



Qy 268 TYAQVLS FLAAFGCLVMAI PAl LI GAI GASTDWNQTAYGLPDP KTTEEADMI LP 321 

: : : | : : : : : : I : : : | : I : : I 

Db 308 SHVKAGCILCGYLKVMPMFLIVl^GMVSRILYTDKVACWPSECERYCGTRVGCTNIAFP 367 

Qy 322 I VLQ YLCPVYI S FFGLGAVS AAVMS SADS S I LSAS SMFARNI YQLS FRQNAS DKEI VWVM 381 

: : M : I : I : : I I I I I I I : : I : I I | : I I : I I : : 

D b 368 TLWELMPNGLRGLMLSVMMASLMSSLTSIFNSASTLFTMDIY-TKIRKKASEKELMIAG 426 

Qy 382 RI-TVFVFGASATAMALLTKTVYG— LWYLSSDLVYI— VIFPQLLCVLFVKGTNTYGAV 436 

| : : I : I I : : : I I : I I : I I =11 I It 

D b 427 RLFMLFLIGISIAWVPIVQSAQSGQLFDYIQSITSYLGPPIAAVFLLAIFWKRVNEPGAF 486 

Qy 437 AGYVSGLFLRI TG GEPYLYLQPLI FYPGYYPDDNGI Y 473 

I I I : I II I IN -I 
Db 487 WGLVLGFLIGISRMITEFAYGTGSCMEPSNCPTIICGVHYLYFAIILF 534 

Qy 474 NQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAWA-RHSEENMDKTILV 532 

| I : I : : I I : I : : : : I : I : I 
Db 535 VISIITWWSLFTKPI PDVHLYRLCWSLRNSKE 568 

Qy 533 KNENIKLD — ELALVKPRQSMTLSSTFTNKEAF LDVDSSPEGSGTED 577 

| | || I : : : I : I : I | | | | : : I : 

Db 569 --ERIDLDAGEEDIQEAPEEATDTEVPKKKKGFFRRAYDLFCGLDQDKGPKMTKEEE 623 



RESULT 3 

US-09-252-991A-24099 

; Sequence 24099, Application US/09252991A 
; Patent No. 6551795 
; GENERAL INFORMATION: 
; APPLICANT: Marc J. Rubenfield et al. 
; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PSEUDOMONAS 
; TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS 
; FILE REFERENCE: 107196.136 
; CURRENT APPLICATION NUMBER: US/09/252,991A 
; CURRENT FILING DATE: 1999-02-18 
; PRIOR APPLICATION NUMBER: US 60/074,788 
; PRIOR FILING DATE: 1998-02-18 
; PRIOR APPLICATION NUMBER: US 60/094,190 
; PRIOR FILING DATE: 1998-07-27 
; NUMBER OF SEQ ID NOS: 33142 
; SEQ ID NO 24099 
; LENGTH: 494 
; TYPE: PRT 
; ORGANISM: Pseudomonas aeruginosa 
; FEATURE: 
; NAME/KEY: UNSURE 
; LOCATION: (232) 
; OTHER INFORMATION: Identity of amino acid at the above locations are unknown. 

US-09-252-991A-24099 



Query Match 10.1%; Score 301; DB 4; Length 494; 

Best Local Similarity 24.9%; Pred. No. 1.1e-20; 

Matches 116; Conservative 82; Mismatches 207; Indels 60; Gaps 15; 

Qy 9 IAIIVFYLLILLVGI WAAWRTKNSGSAEERSEAI IVGGRDI GLLVGGF TMTAT 61 

:|: :| :|| |: I I |: I : :| ll::| II II II 

Db 32 MALDI FWLI YAAGMI ALGWYGMR RAKTRDD-YLVAGRNLG— PGFYLGTMAAT 82 

Qy 62 WVGGGYI NGTAELAVYVPGYGLAWAQAP I GYS LS LI LGGL FFAKPMRS KGYVTMLD P FQQ I 121 

: | | M III I III:: Mill: I : : = 

D b 83 VLGGASTIGTVRLGYVHGISGFWLCGAIG— LGIVGLSLFLAKPLLKLKIYTVTQVLERR 140 

Qy 122 YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALI ATLYTLVGGL 181 

| : I : : I | : | : | : : : I : : I : I I : : I I : 

Db 141 YN P AARHAS AL I MLVYALMI GAT S T I AI GT VMQVL FGL P FWVS I L I GGGVWL YS T I GGM 200 

Qy 182 Y S VAYT DWQ L FC I FVG L - W I S VP FAL S H PAVAD 1 GFTAVHAKYQKPWLG 230 

: I : M : I I : I I I : : : I : : : I MM I 
Db 2 01 WSLTLTDIVQFLIMTVGLVFLLMPLSINDAGXWDALVAKLPASYFDFTAI GW— 252 

Qy 231 TVDS SEVYSWLDS FLLLMLGGI PWQAYFQRVLS S S SAT YAQVLS FLAAFGCLVMAI PAI L 290 

| : | ||: I I : I I I :: I I I : I I M : : I 

Db 253 — DTIVTY FLIYFFGIFIGQDI WQRVFTAR S ET VAKVAG S AAG I YCVL YGMAGAL 305 

Qy 291 IGAIGASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADS 350 

|| Ml I : I : : : I I : I I M I M I : 

Db 306 IGMAAKVL LPD LENWNAFASWEHSLPNGIRGLVIAAALAALMSTASA 354 

Qy 351 SILSASSMFARNIY-QLSFRQNASDKEIVWMRITVFVFGASATAMALLTKTVYGLWYLS 409 

: | : I I : : : : : I : I I I II : I MM I : : 

Db 355 GLIAASTTVTQDLLPRLRRGRGQSDNGDVHENRIATLLLGLWLGIALWSDVISALTVA 414 

Qy 410 SDLVYI VI FPQLLCVLFVKGTNTYGAVA GYVSGLFLRITGG 450 

:|: : |: :: I MM |::: I I I 

Db 415 YNLLVGGMLIPLIGAI YWKRATTAGAITSMTLGFLTVLVFMIKDG 459 



RESULT 4 

US-10-162-012-27 

; Sequence 27, Application US/10162012 

; Patent No. 6682597 

; GENERAL INFORMATION: 

; APPLICANT: Curtis, Rory A.J. 

APPLICANT: Silos-Santiago, Inmaculada 
; APPLICANT: Gu, Wei 

; TITLE OF INVENTION: NOVEL HUMAN ION CHANNEL AND TRANSPORTER FAMILY MEMBERS 

; FILE REFERENCE: 10448-190001 

; CURRENT APPLICATION NUMBER: US/10/162,012 

; CURRENT FILING DATE: 2002-06-04 

; PRIOR APPLICATION NUMBER: US 60/209,845 

; PRIOR FILING DATE: 2000-06-06 

; PRIOR APPLICATION NUMBER: US 09/875,321 

; PRIOR FILING DATE: 2001-06-06 

; PRIOR APPLICATION NUMBER: PCT/US01/18340 

; PRIOR FILING DATE: 2001-06-06 

; PRIOR APPLICATION NUMBER: US 60/209,257 



; PRIOR FILING DATE: 2000-06-05 

; PRIOR APPLICATION NUMBER: US 09/875,423 

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: PCT/US01/18398 

; PRIOR FILING DATE : 2001-06-05 

; PRIOR APPLICATION NUMBER: US 60/209,238 

; PRIOR FILING DATE: 2000-06-05 

; PRIOR APPLICATION NUMBER: US 09/875,363 

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: PCT/US01/18247 

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: US 60/227,068 

; PRIOR FILING DATE: 2000-08-22 

; PRIOR APPLICATION NUMBER: US 09/928,530 

; PRIOR FILING DATE: 2001-08-13 

; PRIOR APPLICATION NUMBER: PCT/US01/25475 

; PRIOR FILING DATE: 2001-08-15 

; PRIOR APPLICATION NUMBER: US 60/226,770 

; PRIOR FILING DATE: 2000-08-21 

; PRIOR APPLICATION NUMBER: US 09/934,421 

; PRIOR FILING DATE: 2001-08-21 

; PRIOR APPLICATION NUMBER: PCT/US01/26096 

; PRIOR FILING DATE: 2001-08-21 

; PRIOR APPLICATION NUMBER: US 60/279,281 

; PRIOR FILING DATE: 2001-03-28 

; PRIOR APPLICATION NUMBER: US 10/109,029 

; PRIOR FILING DATE: 2002-03-28 

; PRIOR APPLICATION NUMBER: PCT/US02/09728 

; PRIOR FILING DATE: 2002-03-28 

; PRIOR APPLICATION NUMBER: US 60/290,288 

; PRIOR FILING DATE: 2001-05-11 

; PRIOR APPLICATION NUMBER: US (not assigned) 

; PRIOR FILING DATE: 2002-05-13 

; NUMBER OF SEQ ID NOS: 48 
; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 27 
; LENGTH: 675 
; TYPE: PRT 
; ORGANISM: Homo sapiens 
US-10-162-012-27 

Query Match 10.0%; Score 298.5; DB 4; Length 675; 

Best Local Similarity 22.7%; Pred. No. 3.2e-20; 

Matches 149; Conservative 115; Mismatches 238; Indels 155; Gaps 29 

Qy 2 AFHVEGL IAIIVFY-LLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGF 56 

II : II I I : : I I I : I I I : I : : I I : : I : II 

Db 18 AFPQKGLEPGDIAVLVLYFLFVLAVGLWSTVKTKR DTVKGYFLAEGNMVWWPVGA- 72 

Qy 57 TMTATWVGGG Y I NGT AEAVYVP G Y GLAWAQAP I G YS L S LILGGLFFAKPMRSKGY 111 

: : I : I I I : I I III : II: I : I : I I : I 

Db 73 SLFASNVGSGHFIGLA GSGAATGISVSAYELNGLFSVLMLAWIFL--PIYIAGQ 124 

Qy 112 VTMLDPFQQIYGKRMGGLLFIPALMGEMFWAAAIFSALGATI SVIID VDMHIS 164 

II : : : II II: II :: :: ||: : : : : I :|:::: 
Db 125 VTTMPEYLR KRFGGI R- 1 P 1 1 LAVLYLFI YI FTKI SVDMYAGAI FIQQS SHLDLYLA 180 



Qy 165 VI I SAL I AT L YT LVGGL Y S VAYT D WQL FC I FVGLWI S VP FAL S H P AVAD I G FT AVHAKY 224 

: : I : I I : I I I : I I II : I : : I : : I I I I : I I 
Db 181 I VGLLAI T AVYT VAGGLAAVI YT DALQT L IML I GALT LMG Y — S FAAVG — GME GL KEK Y 236 

Qy 225 QKPWLGTVDSSEVYS-WLDSFLLLMLGGI 252 

III: : I I 

Db 237 FLALASNRSENSSCGLPREDAFHIFRDPLTSDLPWPGVLFGMSIPSLWY 285 

Qy 253 PW QAY FQ RVL S S S S AT YAQVL S FLAAFG C LVMAI P AI L I GAI GAS T DWN QT AYG L P D 309 

I I | I | : : : : : | : : : | | : : : I : : I I I 

Db 286 - WCT DQ VI VQ RT LAAKN L S HAKGG ALMAAY L KVL P L F I MVF P GMVS RI L F P DQ VA — CAD 342 

Qy 310 PKTTEE ADMI LP I VLQYLCPVYI S FFGLGAVS AAVMS SADS S I LSAS SMFAR 361 

I : : : : I : I : : II : : : I I : I I I I I I I : : I 

Db 34 3 PEICQKICSNPSGCSDIAYPKLVLELLPTGLRGLMMAVMVAALMSSLTSIFNSASTIFTM 402 

Qy 362 NIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLW Y 407 

: : : I I I : I I :: I I : I II III I 

Db 403 DLWN-HLRPRAS EKELMI VGRVFV LLLVLVS I LWT PWQASQGGQL FI Y 450 

Qy 408 LSSDLVYI VIFPQLLCVLFVKGTNTYGAVAGYVSGLFLRITG-GEPYLYLQPLIF 461 

: I I : I : I : I I I I II I : I I I I : : : I : I I 

Db 451 IQSISSYLQPPVAWF IMGCFWKRTNEKGAFWGLISGLLLGLVRLVLDFIYVQPRC- 506 

Qy 462 YPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDV 513 

II: : : : : I : I : I I : I : : : III : : 

Db 507 DQPDERPVLVKS IHYLYFSMI LSTVTLITVSTVSWF TEPPSKEMVSHLTWFT 558 

Qy 514 - FDAWARH S EENMDKT I LVKNEN I KLD ELALVKPRQSMTLSSTFTNKEA 562 

III: I : : I : : .: I : III II:: 

Db 559 RHDP WQKEQAP PAAP L S L T L S QNGMP EAS SSSS VQ FEMVQENT S KTH S C DMT P KQ S 615 



RESULT 5 

US-09-540-236-2193 

; Sequence 2193, Application US/09540236 
; Patent No. 6673910 
; GENERAL INFORMATION: 
; APPLICANT: Gary L. Breton et al. 
; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO MORAXELLA CATARRHALIS 
; TITLE OF INVENTION: FOR DIAGNOSTICS AND THERAPEUTICS 
; FILE REFERENCE: 2709.2005-001 
; CURRENT APPLICATION NUMBER: US/09/540,236 
; CURRENT FILING DATE: 2000-04-04 
; NUMBER OF SEQ ID NOS: 3840 
; SEQ ID NO 2193 
; LENGTH: 521 
; TYPE: PRT 
; ORGANISM: M. catarrhalis 
US-09-540-236-2193 

Query Match 10.0%; Score 298; DB 4; Length 521; 

Best Local Similarity 23.4%; Pred. No. 2.4e-20; 

Matches 131; Conservative 103; Mismatches 212; Indels 114; Gaps 20; 



Qy 



9 IAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWGGGYI 68 



I : : I : : : I : : : I I : I : : I I I : : I I I : : I : I : : I : 

Db 33 I S LAVYFI LMI AI GI YAYFKQKND IEGYMLGGRNLSPAVTALSAGASDMSGWLL 86 

Qy 69 NGTAEAVYVPGYGLAWAQAPIGYSLSLILGG LFFAKPMR SKGYVTMLDPFQ 119 

I Mill I M I M I I M : I I : I I 

Db 87 LG L P G YM YAS GWS I W I AL GLT I GAC AN Y L I VAP RL RVYT ELADN AVT L P D Y F S 140 

Qy 120 QI YGKRMGGLLFI PALMGEMF WAAAI FSALGATI S VI I DVDMHI SVI I SALIATLYT 176 

: : I : | : : : I : I | I I : : : : : : : | : | I 

Db 141 NRFHDKSHLLRIMSAWIILFFTVYTAASLVAGGKLFESSLNLSYSMGLWVTAGVWAYT 200 

Qy 177 LVGGL YS VAYTDWQLFCI FVGLWI S VP FALS HPAVAD I GFTAVHAKYQKPWLGTVDS S E 236 

I II M : II I I : : : I I : I : : : I : M 

Db 201 LFGGFLAVSLTDFVQGVIMLIAMLI VP WAFGE I GGVS EAMAI ATQTNT E 250 

Qy 237 VYSWLDSFLLLMLGGIPWQAY FQRVLS SS SAT YAQVLS FLAAFGCLV 283 

I : M : : ' : : M I I : I : I I I : I : : 

Db 251 VFNWMNG — VTVMGVISLMAWGFGYFGQPHIIVRFMAIRSVKDVPTAMVI GMGWMI 304 

Qy 284 MA-IPAILIGAIGASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSA 342 

:: I |::M I : M I MM I : I I II I I |: I 

Db 305 LSLIGALMVGLAGIAY-VARTGIELKDPET IFLVFSQVLFHPLISGFLLAAILA 357 

Qy 343 AVMS SADS S I LS AS SMFARNI YQLS FRQNAS DKEI VWVMRI TVFVFGASATAMA 396 

I M I : I M II I M I M : I I : I : I I : I : I M 

Db 358 AIMS T I S S Q L L WS S S LT RD I YKL FL D KQAS EARQ VL I G RI S WL VAI I AI MLAG D S N S S 417 

Qy 397 LLTKTVYGLWYLSSDLVYIVIFPQLLCVLFVKGTNTYGAVAGYVSG LFLRITGG 450 

I : : I I : : I I I I I M I : I : : : I I 

Db 418 VLN LVS HAW AG FG AAFGPLVILSLMWKRMNRNGALAGMIVGALTVIIWVYGG 469 

Qy 451 EPYLYLQPLI FYPGYYPDDNGI YN — QKFPFKTLAMVTSFLTNICISYLAKYLFESGTLP 508 

I I I : : | | : I I I : I I : I : II 
Db 470 FEIGGQPANDAIYSILPGFAF SLVTTIAVSLM TAP 504 

Qy 509 PKLDVFDAWARHSEENMDK 528 

I : : I M M 

Db 505 PPVYIVQKF EDMEK 518 



RESULT 6 
US-07-841-651-2 

; Sequence 2, Application US/07841651 
; Patent No. 5410031 
; GENERAL INFORMATION: 

APPLICANT: Pajor, Ana M 

APPLICANT: Wright, Ernest M 

TITLE OF INVENTION: Cloning and Functional Expression of a 

TITLE OF INVENTION: Mammalian Na+/Nucleoside Cotransporter: A Member of the 

TITLE OF INVENTION: SGLT Family 
NUMBER OF SEQUENCES: 4 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Sheldon & Mak 

STREET: 225 South Lake Avenue, Ninth Floor 
CITY: Pasadena 
STATE: California 



COUNTRY: USA 
ZIP: 91101 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/841,651 
FILING DATE: 19920224 
CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 
NAME: Mandel, SaraLynn 
REGISTRATION NUMBER: 31,853 
REFERENCE/ DOCKET NUMBER: 8772 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (818) 796-4000 
TELEFAX: (818) 795-6321 
INFORMATION FOR SEQ ID NO: 2: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 672 amino acids 
TYPE: AMINO ACID 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-07-841-651-2 

Query Match 10.0%; Score 298; DB 1; Length 672; 

Best Local Similarity 25.0%; Pred. No. 3.5e-20; 

Matches 153; Conservative 89; Mismatches 232; Indels 138; Gaps 25; 

Qy 9 IAII-VFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67 

| | : | : : | | : : | | : | : I I I I : : I I = | : : | : : I I : 

Db 26 IAVIAAYFLLVI GVGLWSMCRT-NRGTV GGYFLAGRSMVWWPVGASLFASNIGSGH 80 

Qy 68 INGTAEAVYVPGYGLAWAQAP I G YS LS LI LGGLFFAKPMRS KGYVTMLD P FQQI YG 123 

|| Ml ||:: ::| I II : I =11 I 
Db 81 FVGLA GT GAAN G LAVAG FEWNAL FWL LL GWL FAP VYLT AGVI TM PQYLR 130 

Q y 124 KRMGG LLFI PALMGEMFWAAAT F — SAL GAT I SVI I DVDMH I SVI I S A 169 

M || I : I : : : I : I I I I I =111 
Db 131 KRFGGH RI RL YL S VL S L FL YI FT KI S VDMFS GAVFI QQALGWN I YASVIALL 182 

Qy 170 L I AT L YT LVGG L YS VAYT DWQ L FC I FVGLW I S VP FAL S H P AVAD I GFT AVHAK Y 224 

I : I I : I I I :: I I I I I I I I : I : I I : : : I I 

Db 183 GITMVYTVTGGLAALMYTDT VQTFVI IAGAFI LTGYAFHEVG GYSGLFDKYMGAMT 238 

Qy 225 QKPWLGTVDSSEVYSWLDSFLLL MLGGIPW QAYF 258 

: I : I : I I I I : I I : I : I I I 

Db 239 SLTVSEDPAVGNISSSCYRPRPDSYHLLRDPVTGDLPWPALLLGLTIVSGWYWCSDQVIV 298 

Qy 259 Q RVL S S S SAT YAQ VL S F LAAF GC L VMAI P AI L I G AI GAS T D WN QT AY G L P D P KT TE 314 

| | | : : I : : I : I : : I I : : I I : II 

Db 299 QRCLAGRNLTHIKAGCILCGYLKLTPMFLMVMPGMISRILYPDEVACVAPEVCKRVCGTE 358 

Qy 315 E — ADMI LPIVLQYLCPVYI S FFGLGAVSAAVMS SADS SI LSAS SMFARNI YQLS FRQNA 372 

: : : I : : I I : I : I I : I I I I I : I : : I : I I I I I 
Db 359 VGCSNIAYPRLWKLMPNGLRGLMLAVMLAALMS SLAS I FNS S STLFTMDI YTL- -RPRA 416 



Qy 373 SDKEI VWVMRITVFVFGASATAMALLTKTVYG LWYLS SDLVYIV — I FPQLLCVLFV 427 

: | : : | | : I I : . I : : I hi I : • : I I I 

Db 417 GEGELLLVGRLWVVFT VAVSVAWL PVVQAAQGGQLFD YI QS VS S YLAP PVSAVFVVALFV 476 

Qy 428 KGTNTYGAVAGWSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMV-- 485 

I I I I : I I : : I : I I I : I 

Db 477 PRVNEKGAFWGLI GGLLMGLARLI P EFSFGTGSCVRP 513 

Qy 4 86 TSFLTNICISYLAKYLFE-SG T L P - P KLDVFDAWA- RH S EENMD KT I 530 

: I I : I I I I I I III:: I : I I h I 
Db 514 SACPAFLCRVHYLYFAIVLFFCSGLLI I IVSLCTAPI PRKHLHRLVFSLRHSKE 567 

Qy 531 LVKNENIKLDEL 542 

: I : : I I I 
Db 568 — EREDLDADEL 577 



RESULT 7 
US-07-841-651-3 

; Sequence 3, Application US/07841651 

; Patent No. 5410031 

; GENERAL INFORMATION: 

APPLICANT: Pajor, Ana M 
; APPLICANT: Wright, Ernest M 

; TITLE OF INVENTION: Cloning and Functional Expression of a 

; TITLE OF INVENTION: Mammalian Na+/Nucleoside Cotransporter: A Member of the 

; TITLE OF INVENTION: SGLT Family 
NUMBER OF SEQUENCES: 4 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Sheldon & Mak 

; STREET: 225 South Lake Avenue, Ninth Floor 

; CITY: Pasadena 

STATE: California 
; COUNTRY: USA 

; ZIP: 91101 

; COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 
; OPERATING SYSTEM: PC-DOS/MS-DOS 

; SOFTWARE: PatentIn Release #1.0, Version #1.25 

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/841,651 

FILING DATE: 19920224 
; CLASSIFICATION: 435 

; ATTORNEY/AGENT INFORMATION: 

NAME: Mandel, SaraLynn 

REGISTRATION NUMBER: 31,853 

REFERENCE/ DOCKET NUMBER: 8772 
; TELECOMMUNICATION INFORMATION: 

TELEPHONE: (818) 796-4000 

TELEFAX: (818) 795-6321 
; INFORMATION FOR SEQ ID NO: 3: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 672 amino acids 

TYPE: AMINO ACID 



TOPOLOGY: linear 
MOLECULE TYPE: protein 
HYPOTHETICAL: NO 
ORIGINAL SOURCE: 

ORGANISM: Oryctolagus cuniculus 
US-07-841-651-3 

Query Match 10.0%; Score 298; DB 1; Length 672; 

Best Local Similarity 25.0%; Pred. No. 3.5e-20; 

Matches 153; Conservative 89; Mismatches 232; Indels 138; Gaps 25; 

Qy 9 IAII-VFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67 

I I : I :: I I : : | I : I : Mil: : | | : I : : I : : I I : 

Db 26 IAVIAAYFLLVIGVGLWSMCRT-NRGTV GG Y FLAGR SMVWW P VGAS L FAS N I G S GH 8 0 

Qy 68 INGTAEAVYVPGYGLAWAQAPIGYSLS LILGGLFFAKPMRSKGYVTMLDPFQQIYG 123 

I I Mill:: : : I I I I : I : I I I 
Db 81 FVGLA GT GAANGLAVAG FEWNALFWLLLGWL FAP VYLT AGVI TM PQYLR 130 

Qy 124 KRMGG LLFI PALMGEMFWAAAI F — SALGATISVIIDVDMHISVII SA 169 

I I II I : I : : : I : I III I : Ml 

Db 131 KRFGGHRI RLYLSVLSLFLYIFTKISVDMFSGAVFIQQALGWNI YASVIALL 182 

Qy 170 L I AT L YT L VG GL Y S VAYT DWQ L FC I FVGLW I S VP FAL S H P AVAD I G FT AVHAK Y 224 

I : I I : I I I : : I I I I I I I I : I : I I : : : II 

Db 183 G I TMVYT VT GGLAALMYT DT VQT FVI I AGAF I LT G YAFH EVG GYSGLFDKYMGAMT 238 

Qy 225 — QKPWLGTVDSSEVYSWLDSFLLL MLGGIPW QAYF 258 

: I : I : I I I I : I I : I : I I I 

Db 239 SLTVSEDPAVGNISSSCYRPRPDSYHLLRDPVTGDLPWPALLLGLTIVSGWYWCSDQVIV 298 

Qy 259 Q RVL S S S SAT YAQVL S FLAAFGC LVMAI P AI L I GAI GAS T DWNQTAY GL P D P KT TE 314 

III::):: I : I :: I I :: I |: II 

Db 299 QRCLAGRNLTHIKAGCILCGYLKLTPMFLMVMPGMISRILYPDEVACVAPEVCKRVCGTE 358 

Qy 315 E — ADMI LP I VLQYLCPVYI S FFGLGAVSAAVMS S ADS S I LSAS SMFARN I YQLS FRQNA 372 

: : : I :: II : I : I I : I I I I I : I : : I : I I I I I 
Db 359 VGCSNIAYPRLWKLMPNGLRGLMLAWILAALMSSLASIFNSSSTLFTMDIYTL--RPRA 416 

Qy 373 S DKE I VWVMRI T VFVFGAS AT AMAL LT KT VYG LWYLSSDLVYIV — IFPQLLCVLFV 427 

: I :: I I : I I : I : : I I : I I : : : I I I 

Db 417 GEGELLLVGRLWWFIVAVSVAWLPWQAAQGGQLFDYIQSVSSYLAPPVSAVFWALFV 476 

Qy 428 KGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMV— 485 

I II I : I I : : I : I I I : I 

Db 477 PRVNEKGAFWGLIGGLLMGLARLIP EFSFGTGSCVRP 513 

Qy 486 TSFLTNICISYLAKYLFE-SG TLP-PKLDVFDAWA-RHSEENMDKTI 530 

: I I : II II II III:: I : I I I : I 

Db 514 SACPAFLCRVHYLYFAIVLFFCSGLLIIIVSLCTAPIPRKHLHRLVFSLRHSKE 567 

Qy 531 LVKNENI KLDEL 542 

: I :: III 
Db 568 — EREDLDADEL 577 
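The per-hit summary figures appear to be simple ratios of the other numbers printed in the same block. The sketch below is an inference from the values in this listing, not documented GenCore behaviour. It assumes that Query Match is the hit score divided by the perfect score (2972 for this query) and that Best Local Similarity is the number of identical positions divided by all alignment columns (matches + conservative + mismatches + indels). The function names are illustrative only.

```python
def query_match(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the query's perfect score (assumed definition)."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all alignment columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Values copied from RESULT 7 (US-07-841-651-3) in this listing.
print(query_match(298, 2972))                    # 10.0
print(best_local_similarity(153, 89, 232, 138))  # 25.0
```

The same two ratios reproduce the figures printed for the other hits in this section, for example 9.8% and 24.1% for RESULT 8.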



RESULT 8 



US-10-162-012-30 

; Sequence 30, Application US/10162012 
; Patent No. 6682597 
; GENERAL INFORMATION: 

; APPLICANT: Curtis, Rory A.J. 
; APPLICANT: Silos-Santiago, Inmaculada 
; APPLICANT: Gu, Wei 

; TITLE OF INVENTION: NOVEL HUMAN ION CHANNEL AND TRANSPORTER FAMILY MEMBERS 

; FILE REFERENCE: 10448-190001 

; CURRENT APPLICATION NUMBER: US/10/162,012 

; CURRENT FILING DATE: 2002-06-04 

; PRIOR APPLICATION NUMBER: US 60/209,845 

; PRIOR FILING DATE: 2000-06-06 

; PRIOR APPLICATION NUMBER: US 09/875,321 

; PRIOR FILING DATE: 2001-06-06 

; PRIOR APPLICATION NUMBER: PCT/US01/18340 

; PRIOR FILING DATE: 2001-06-06 

; PRIOR APPLICATION NUMBER: US 60/209,257 

; PRIOR FILING DATE: 2000-06-05 

; PRIOR APPLICATION NUMBER: US 09/875,423 

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: PCT/US01/18398 

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: US 60/209,238 

; PRIOR FILING DATE: 2000-06-05 

; PRIOR APPLICATION NUMBER: US 09/875,363 

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: PCT/US01/18247 

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: US 60/227,068 

; PRIOR FILING DATE: 2000-08-22 

; PRIOR APPLICATION NUMBER: US 09/928,530 

; PRIOR FILING DATE: 2001-08-13 

; PRIOR APPLICATION NUMBER: PCT/US01/25475 

; PRIOR FILING DATE: 2001-08-15 

; PRIOR APPLICATION NUMBER: US 60/226,770 

; PRIOR FILING DATE: 2000-08-21 

; PRIOR APPLICATION NUMBER: US 09/934,421 

; PRIOR FILING DATE: 2001-08-21 

; PRIOR APPLICATION NUMBER: PCT/US01/26096 

; PRIOR FILING DATE: 2001-08-21 

; PRIOR APPLICATION NUMBER: US 60/279,281 

; PRIOR FILING DATE: 2001-03-28 

; PRIOR APPLICATION NUMBER: US 10/109,029 

; PRIOR FILING DATE: 2002-03-28 

; PRIOR APPLICATION NUMBER: PCT/US02/09728 

; PRIOR FILING DATE: 2002-03-28 

; PRIOR APPLICATION NUMBER: US 60/290,288 

; PRIOR FILING DATE: 2001-05-11 

; PRIOR APPLICATION NUMBER: US (not assigned) 
; PRIOR FILING DATE: 2002-05-13 
; NUMBER OF SEQ ID NOS: 48 
; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 30 
; LENGTH: 672 
; TYPE: PRT 
; ORGANISM: Homo sapiens 



US-10-162-012-30 



Query Match 9.8%; Score 292; DB 4; Length 672; 

Best Local Similarity 24.1%; Pred. No. 1.4e-19; 

Matches 147; Conservative 91; Mismatches 237; Indels 136; Gaps 22; 

Qy 8 LIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67 

: : I : : II : : I I : I : II I h : I I : | : : I : : I I : 

Db 26 ILVIAAYFLLVIGVGLWSMCRT-NRGTV GGYFLAGRSMVWWPVGAS LFASN I GS GH 8 0 

Qy 68 I N GT AEAVYVP G YGLAWAQAP I G Y S L S LI LGGLFFAKPMRS KG YVTMLDP FQQI YG 123 

|| Ml II:: : : I I I I : I : I I I 
Db 81 FVGLA GT GAAS GLAVAG F EWNAL FWL LLGWL FAP VYLT AGVT TM PQYLR 130 

Qy 124 KRMGG LLFI PALMGEMFWAAAI F- - SAL GAT I SVI I DVDMHI SVI I SA 169 

| | | | | : | : : : I : I I I I I : I I I 

Db 131 KRFGGRRIRLYLSVLSLFLYIFTKISVDMFSGAVFIQQALGWNI YASVIALL 182 

Qy 170 L I AT L YT L VGGL Y S VAYT DWQL FC I FVGLWI S VP FAL S H P AVAD I G FT AVHAK Y 224 

I : | I : I I I : : I I I I I I I I I : : I I : : : I I 

Db 183 GITMIYTVTGGLAALMYTDTVQTFVILGGACILMGYAFHEVG GYSGLFDKYLGAAT 238 



Qy 225 QKPWLGTVDSSEVYSWLDSFLLL MLGGIPW QAYF 258 

: I : I : I I I : I I : I : I I I 

Db 239 SLTVSEDPAVGNISSFCYRPRPDSYHLLRHPVTGDLPWPALLLGLTIVSGWYWCSDQVIV 298 



Qy 259 QRVL S S S S AT YAQVL S FLAAFGC LVMAI P AI LI GAI GAS T DWNQT AYGL PD P KT TE 314 

| | | : I I : : I : I : : I I : : I : I : II 

Db 299 QRCLAGKSLTHI KAGCI LCGYLKLTPMFLMVMPGMI SRI LYPDEVACWPEVCRRVCGTE 358 

Qy 315 E--ADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNA 372 

::: | :: | | : I : ||:||l I =11 I I 

Db 359 VGCSNIAYPRLWKLMPNGLRGLMLAWIIAAmSSIASIFNSSSTLFTMDIY-TRLRP^ 417 

Qy 373 SDKEIVWVMRI-TVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQLLCV LFV 427 

| : I : : I I : I I : I : : : I : I =1=1 III 

Db 418 GDRELLLVGRLWWFIVWSVAWLPWQAAQGGQLFDYIQAVSSYIJVPPVSAVFVLALFV 47 7 

Qy 428 KGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMV 485 

| | | | : | | : : | : | | : : I 

Db 478 PRVNEQGAFWGLIGGLLMGLARLIP EFSFGSGSCVQP 514 

Qy 486 TSFLTNICISYLAKYLFE-SGTLPPKLDVFDAW ARHSEENMDKTI 530 

: M : I I I I I I I : : I : II I : I 

Db 515 SACPAFLCGVHYLYFAIVLFFCSGLLTLTVSLCTAPIPRKHLHRLVFSLRHSKE 568 

Qy 531 LVKNENIKLDE 541 

: I : : II 
Db 569 --EREDLDADE 577 



RESULT 9 

US-09-543-681A-4994 

; Sequence 4994, Application US/09543681A 
; Patent No. 6605709 
; GENERAL INFORMATION: 
; APPLICANT: GARY BRETON 
; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PROTEUS MIRABILIS FOR 
; TITLE OF INVENTION: DIAGNOSTICS AND THERAPEUTICS 
; FILE REFERENCE: 2709.1002-001 
; CURRENT APPLICATION NUMBER: US/09/543,681A 
; CURRENT FILING DATE: 2000-04-05 
; PRIOR APPLICATION NUMBER: US 60/128,706 
; PRIOR FILING DATE: 1999-04-09 
; NUMBER OF SEQ ID NOS: 8344 
; SEQ ID NO 4994 
; LENGTH: 548 
; TYPE: PRT 
; ORGANISM: Proteus mirabilis 
US-09-543-681A-4994 

Query Match 9.3%; Score 277.5; DB 4; Length 548; 

Best Local Similarity 24.3%; Pred. No. 2.6e-18; 

Matches 144; Conservative 95; Mismatches 229; Indels 125; Gaps 23; 

Qy 4 HVEGL IAIIVFYLLILL-VGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGF 56 

|||: || | : : | : : : I : I : : : I : : : : : I : : I 

Db 6 HTEGVGLSTI DYAI FALYVI 1 1 1 S LGLWV SRSKDGAKKGTKDYFLAGKTLPWWAIGS 62 

Qy 57 TMTATWV GG G Y I N GT AEAVYVP G YGLAW AQ AP I G Y SLSLILGGLF FAK PMR 107 

: : | : I I : I I I I II | : I I : : I 

Db 63 SLIAANISAEQFIGMSGSGFSIGLAIASY EWMAA LTLIIVAKYFLPIFI 111 

Qy 108 SKGYVTMLDPFQQIYGKRMGGLLFIPALMGEMFWAAA-IFSAL GATISVIIDV 159 

| | | : : : : I III ill I II I : I : I 

D b 112 EKGIYTIPEFVENRFKSR--NLKTILA VFWLALFIFVNLTSVLYLGSLALETILGV 165 

Qy 160 DMHI SVII SALIATLYTLVGGLYSVAYTDWQLFCI FVGLWI SVPFALSH 209 

| : : | | | | : I : I I I I : I I : I I I I I : I : : I : I : I : 
Db 166 PMMYAI I GLAL FAVI Y S L YGGL S AVAWT D WQVF FL I LG G L FT T VLAVS Y I GGD GG I MEG 225 

Q y 210 P AVAD I G FT AVHAK YQ K P WLGT VD S S E VY SWLDSFLL LML GG I P W Q 255 

I I I : II :: : :::||: I I 

Db 226 LSKMTAAAPDHFKMILAKENPQFMNLPG IAVLIGGL-WVANLYYWGFNQ 273 

Qy 256 AYFQRVLS S S S AT YAQVLS FLAAFGCLVMAI PAI L — I GAI GASTDWNQTAYGL P 308 

I I I : : I II III I : : I : : II : I I I I I 

Db 274 YIIQRALAAKSINEAQKGLVFAAFLKLIVPILVWPGIAAFVITTDPTLMA-GLGTMAQE 332 

Qy 309 DPKTT EEADMI LPI VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSAS SMFARNI YQLS F 368 

| : | | I : I : I I : | : : | | : : | | | : | : : : I : I I : 

Db 333 HIPTLAQADKAYPWLTQFL-PIGAKGWFAALAAAIVSSLASMLNSIATIFTMDIYKEYI 391 

Qy 369 RQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYI VIFPQLLC 423 

:|: :| I II: : I :| I I = I I I :l 

Db 392 GPKSSETRLVNVGRI SAVIALI IACFIAPL LGGIDQAFQYIQEYTGLVSPGILA 445 

Qy 424 V LFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPF 47 9 

I II I II II: I I : ! : I I I I M 
Db 446 VFLLGLFWKKTNAKGAI I GWLS I PFAL FLKLMPL GMPF 484 

Qy 480 KTLAMWSFLTNICISYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILV 532 

|| | : | : : : I : : I I I I : I : : 



Db 485 L DQMMYT F I FT AWT GLVS LT S T KS DD S VGAI VLT DAT FKTQ S G FN I AS Y 1 1 M 537 
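Each hit is preceded by three semicolon-delimited summary lines (Query Match, Best Local Similarity, Matches and so on). A minimal sketch of reading those lines into a dictionary follows. The regular expression and the function name `parse_summary` are assumptions made for illustration, and the pattern only handles cleanly printed fields, so OCR-damaged separators would need repair first.

```python
import re

# A field is "<name> <number>[%];" where the name may contain letters, dots and spaces.
FIELD = re.compile(r"([A-Za-z][A-Za-z. ]*?)\s+([-+0-9.eE%]+)\s*;")


def parse_summary(block: str) -> dict:
    """Map each field name in a hit summary block to its numeric value."""
    values = {}
    for name, value in FIELD.findall(block):
        values[name.strip()] = float(value.rstrip("%"))
    return values


# Summary lines copied from RESULT 9 (US-09-543-681A-4994).
summary = parse_summary(
    "Query Match 9.3%; Score 277.5; DB 4; Length 548;\n"
    "Best Local Similarity 24.3%; Pred. No. 2.6e-18;\n"
    "Matches 144; Conservative 95; Mismatches 229; Indels 125; Gaps 23;"
)
print(summary["Score"], summary["Pred. No."], summary["Gaps"])
```

Keeping the values as floats avoids special-casing half-point scores such as 277.5, which arise from the 0.5 gap-extension penalty.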



RESULT 10 

US-09-543-681A-6886 

; Sequence 6886, Application US/09543681A 

; Patent No. 6605709 

; GENERAL INFORMATION: 

; APPLICANT: GARY BRETON 

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PROTEUS MIRABILIS FOR 

; TITLE OF INVENTION: DIAGNOSTICS AND THERAPEUTICS 

; FILE REFERENCE: 2709.1002-001 

; CURRENT APPLICATION NUMBER: US/09/543,681A 

; CURRENT FILING DATE: 2000-04-05 

; PRIOR APPLICATION NUMBER: US 60/128,706 

; PRIOR FILING DATE: 1999-04-09 

; NUMBER OF SEQ ID NOS: 8344 

; SEQ ID NO 6886 

; LENGTH: 554 
; TYPE: PRT 
; ORGANISM: Proteus mirabilis 
US-09-543-681A-6886 

Query Match 9.2%; Score 274.5; DB 4; Length 554; 

Best Local Similarity 23.0%; Pred. No. 5.1e-18; 

Matches 125; Conservative 96; Mismatches 214; Indels 109; Gaps 24; 

Qy 6 EGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGL LVGGFTMTA 60 

: :| ::| I I : ||: II:: ||: II: I : I I I 

Db 38 QAIIMFLIFVGLTLYITYWASKRTRS RSDYYTAGGKITGFQNGMAIAGDFMSAA 91 

Qy 61 TWVGGGYI NGTAEAVYVPGY - GLAWAQAP IGYSLSLILGG LFFAKPMRSKGYVTML 115 

::: I : II II II II: ::| I: :|: I I 

Db 92 SFL GISALVYTSGYDGLI YSIGFLIGWPIILFIIAERLRNLGRYTFA 138 

Qy 116 DPFQ-QI YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALIATL 174 

I : : I : I I : I : : I I I : : : : I I : I I : : : I 

Db 139 DWS YRL S P KP I RT L S AI GS LVWAL YL I AQMVGAGKL I ELL FGLN YH I AVI LVGI LMVL 198 

Qy 175 YTLVGGLYSVAYTDWQLFCI FVGLWISVPFALSHPAVADI GFTAVHAK YQKPWL- 229 

Mil:::::: : I : III:: :M : 

Db 199 YVLFGGMLATTWVQ 1 1 KAI LLLAGAT FMAVMVMK AADFNFNTLFKEAVNVHQKGFSI 255 

Qy 230 GTVDS S EVYSWLDS FLLLMLG — GI PWQAYFQRVLS S S SAT Y 269 

II I : I I I II I hi IMI I 

Db 256 MS PGGLV — SDPI SALSLGLALMFGTAGLP — HI IMRFFTVSDAKEARKSVFYATGFI GY 311 

Qy 270 AQVLS FLAAFGCLVMAI PAI L I GAIGASTDWNQTAYGLPDPKTTEEADMILPIVLQ 325 

:|:|: II ::: I I II : I I I II 
Db 312 FYILTFIIGFGAILLVSPNPLFKDAAGALIGGT- -NMAAVHLAD- 353 

Qy 326 YLCPVYI S FFGLGAVS AAVMS SADS S I LSAS SMFARNI YQLS FRQ-NAS DKEI VWV 380 

I : I I I I : I I : : : I : : I : : : I : I : : : : I 

Db 354 AVGGNFF-LGFI SAVAFATI LAWAGLTLAGASAVSHDLYANVI KNGQADERQELKV 409 



Qy 381 MRITVFVFGASATAMALL--TKTVYGLWYLSSDLVYIVIFPQLLCVLFVKGTNTYGAVAG 438 



:IM : I I : :| : : : |: : II :| :: II Mill 

Db 410 SKITWILGIVAIGLGILFEKQNIAFMVGLAFSIAASCNFPIILLSMYWKGLTTRGAVIG 469 

Qy 439 WSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLA 498 

I I I : : I : | I | : I II ::| I : : I : : : 

Db 470 GWSGLIVAVT LMILGPTI-WVSILGHDTPIYPYEYP ALFSMIIAFIV 515 

Qy 499 KYLF 502 

: II 

Db 516 SWLF 519 
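Every Qy and Db row carries its start and end coordinates, which gives a cheap way to detect extraction damage: the number of residue letters on the row should equal end - start + 1. The sketch below is an illustrative check, not part of the search program. The function name and the pattern are assumptions, and gap characters and stray spaces are ignored on purpose.

```python
import re

# "<Qy|Db> <start> <sequence with optional spaces and gap marks> <end>"
ROW = re.compile(r"^\s*(Qy|Db)\s+(\d+)\s+(.*?)\s+(\d+)\s*$")


def row_is_consistent(line: str) -> bool:
    """True when the residue count on an alignment row matches its coordinates."""
    m = ROW.match(line)
    if m is None:
        return False
    start, body, end = int(m.group(2)), m.group(3), int(m.group(4))
    residues = sum(1 for ch in body if ch.isalpha())
    return start + residues - 1 == end


print(row_is_consistent("Qy 499 KYLF 502"))
# A row from RESULT 9 as extracted: one residue short, so "W" is likely a misread "VV".
print(row_is_consistent(
    "Qy 160 DMHI SVII SALIATLYTLVGGLYSVAYTDWQLFCI FVGLWI SVPFALSH 209"))
```

A failed check only says that a row is damaged, not where. Deciding which character was misread still needs a comparison with another copy of the same sequence.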



RESULT 11 
US-09-657-960-3 

; Sequence 3, Application US/09657960 

; Patent No. 6649342 

; GENERAL INFORMATION: 

; APPLICANT: Mack, David 

; APPLICANT: Gish, Kurt 

; TITLE OF INVENTION: NOVEL METHODS OF DIAGNOSING BREAST CANCER, COMPOSITIONS, AND METHODS OF 

; TITLE OF INVENTION: SCREENING FOR BREAST CANCER MODULATORS 

; FILE REFERENCE: A-69196/DJB/JJD 

; CURRENT APPLICATION NUMBER: US/09/657,960 

; CURRENT FILING DATE: 2000-09-08 

; PRIOR APPLICATION NUMBER: US 09/525,361 

; PRIOR FILING DATE: 2000-03-15 

; PRIOR APPLICATION NUMBER: US 09/453,137 

; PRIOR FILING DATE: 1999-12-02 

; PRIOR APPLICATION NUMBER: US 09/450,810 

; PRIOR FILING DATE: 1999-11-29 

; PRIOR APPLICATION NUMBER: US 09/268,865 

; PRIOR FILING DATE: 1999-03-15 

; PRIOR APPLICATION NUMBER: PCT/US00/06952 

; PRIOR FILING DATE: 2000-03-15 

; NUMBER OF SEQ ID NOS: 4 

; SOFTWARE: PatentIn version 3.0 

; SEQ ID NO 3 

; LENGTH: 718 
; TYPE: PRT 
; ORGANISM: Homo sapiens 
US-09-657-960-3 

Query Match 9.2%; Score 272.5; DB 4; Length 718; 

Best Local Similarity 22.3%; Pred. No. 1.2e-17; 

Matches 152; Conservative 113; Mismatches 241; Indels 177; Gaps 30; 

Qy 9 IAII-VFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGG- 66 

III: ::::):: : I : I I : : I : I : I : I I : | 

Db 10 IAIVALYFILVMCIGFFAMWKSNRSTVS GYFLAGRSMTWVT I GAS L 55 

Qy 67 YINGTAEAVYVPGYGLAWAQAP I GYS LSLILGGLFFAKPMRSKGYVTMLD 116 

: : : : : I I I : I I : : I : I I : I : I I I II 

Db 56 FVSNIGSEHFI G LAG S GAAS G FAVGAWE FNAL L L LQ L L GWVFI P I Y I RS - GVYTM — 109 

Qy 117 PFQQIYGKRMGG LLFI PALMGEMFWAAAIFSALGATI SVI IDVDMHI S 164 

I I I I : I : I : : : I : I I : : : : : I 



Db 110 — PEYLSKRFGGHRIQVYFAALSLI LYI FTKLSVDLYSGALF IQESLGWNLYVS 161 



Qy 165 VI I SALIATLYTLVGGLYSVAYTDWQLFCI FVGLWI SVPFALSHPAVADI-GFTAVHAK 223 

||: : I I : I I I : I I I I : I : :| I : — I II I = 

Db 162 VI LLI GMTALLTVTGGLVAVI YTDTLQALLMI IG ALTLMI I S IMEI GGFEEVKRR 216 

Q y 224 YQKPWLGTVDS S EVYSWLDS F LLLMLGG IPW 254 

| : | : | | | : : III :M 

Db 217 YM LASPDVTSILLTYNLSNTNSCNVSPKKEALKMLRNPTDEDVPWPGFILGQTP 270 

Q y 255 QAYFQRVLS S S SATYAQ VLS FLAAFGCLVMAI PAI L IGA 293 

I | | | I : : : : I : : I I : : : I : : I 

Db 2 71 ASWYWCADQVIVQRVIAAKNIAHAKGSTLMAGFLKLLPMFIIVVPGMISRILFTDDIAC 330 

Qy 2 94 I GASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAV 344 

| | : : | | | : : | I I : : : I I : 

Db 331 INPEHCMLVCGSRAGCSNIAY PRLVMKLVPVGLRGLMMAVMIAAL 375 

Qy 345 MS SADS S I LSAS SMFARNI YQLS FRQNASDKEI VWVMRITV- FVFGASATAMALLTKTVY 403 

|| || I I I : : I : : I : I I : : I I : I : : I I I I I ' I : : : : 
Db 376 MSDLDS I FNS AST I FTLDVYKL- 1 RKSAS SRELMIVGRI FVAFMWI S IAWVPI I VEMQG 434 

Qy 4 04 GLWYLSSDLVYIVIFPQL LCVLFVKGTNT YGAVAGYVSGLFLRITGGEPYLY 455 

III | : I : 1:111 I I : I I : I I I : I 

Db 435 GQMYLYI QEVADYLT P PVAALFLLAI FWKRCNEQGAFYGGMAGFVLGAVRLI LA FAY 491 

Qy 456 LQPLIFY PGYYPDDNGIYNQKFPFKTLAMVT SFLT NICISYLAKY 500 

| | | : | : : | I : : I I I I : I 

Db 492 RAPECDQPDNRPGFIKDIHYMYVATGLFWVTGLITVIVSLLTPPPTKEQIRTTTFWSKKN 551 

Q y 501 L FE S GT L P P KLDVF DAWARH S E ENMD KT I LVKN ENIK LDELALVKPRQ 54 9 

| II:: : : I I I I : : I : : I I I III: 

Db 552 LWKENCSPKEEPYQMQEKSILRCSENNETINHIIPNGKSEDSIKGLQPEDVNLLVTCRE 611 

Qy 550 SMTLSSTFTNKEAFLDVDSSPEG 572 

:: : II II: I 
Db 612 EGN P VAS LGH S EAET P VDAY S N G 634 



RESULT 12 

US-09-134-001C-4744 

; Sequence 4744, Application US/09134001C 
; Patent No. 6380370 
; GENERAL INFORMATION: 

; APPLICANT: Lynn Doucette-Stamm et al 

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO STAPHYLOCOCCUS 

; TITLE OF INVENTION: EPIDERMIDIS FOR DIAGNOSTICS AND THERAPEUTICS 
; FILE REFERENCE: GTC-007 

; CURRENT APPLICATION NUMBER: US/09/134,001C 
; CURRENT FILING DATE: 1998-08-13 

; PRIOR APPLICATION NUMBER: US 60/064,964 
; PRIOR FILING DATE: 1997-11-08 
; PRIOR APPLICATION NUMBER: US 60/055,779 
; PRIOR FILING DATE: 1997-08-14 
; NUMBER OF SEQ ID NOS: 5674 
; SEQ ID NO 4744 
; LENGTH: 518 
; TYPE: PRT 
; ORGANISM: Staphylococcus epidermidis 
US-09-134-001C-4744 

Query Match 8.8%; Score 262.5; DB 4; Length 518; 

Best Local Similarity 22.2%; Pred. No. 6.9e-17; 

Matches 126; Conservative 102; Mismatches 223; Indels 117; Gaps 25; 

Qy 9 IAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYI 68 

: I I I : : : : : I : : I : : I : I : : I I I I I : : I : : I I 

Db 27 VMIIVYFIILLIIGFY-GYRQATGNLSE FMLGGRSIGPYITALSAGASDMSGWMI 80 

Qy 69 NGTAEAVYVP GYGLAWAQAP I GYS LSLILGGL — F FAK PMRS KGY VTMLDPFQ 119 

I : I I I I : : I I I : I I : I : I : I I : 

Db 81 MGLPGSVYSTGLSAIW ITIGLTLGAYINYFVVAPRLRVYTEIAGDAITLPDFFK 134 

Qy 120 QI YGKRMGGLLFI PALMGEMFWAAAI FSAL GAT I SVI I DVDMHI SVI I SALIATLYT 176 

: : I I : : | : | | : : | : : | I : I I | 

Db 135 NRLDDKKNI I KI I SGLI I WFFTLYTHSGFVSGGKLFESAFGLNYHAGLLI VAI IVI FYT 194 

Qy 177 L VGG L Y S VAYT DWQ L FC I FVG L W I S VP FAL S H P AVAD I - G FT AVHAK YQ - K P W 228 

II : I : I I I : : : : I | I : : I : I III 

Db 195 FFGGYLAVSITDFFQGVIMLIAM-VMVPIV ALLKLNGWDTFHDIAQMKPTNLDLFR 249 

Qy 229 LGTVDS S EVYSWLDS FLLLMLGGI PWQAYFQRVLS S S SAT YAQVLS FLAAFGCLVM 284 

III : : I I I : : : : I . : : I II 

Db 250 GTTVLGIV SLFSW GLGYFGQPHIIVRFMSIKSHKLLPKARRLGISWM 296 

Qy 285 AIPAILIGAIGASTDW NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAV 340 

I : hllll : : I I I : I : : : I I : III: 

Db 297 AVG— LLGAIGVGLTGISFISERHIKLEDPET LFIVMSQILFHPLVGGFLLAAI 348 

Qy 341 S AAVMS SAD S S I L S AS SMFARN I YQL S FRQNAS DKEI VWVMRI TVFVFGASATAMAL 397 

I I : II : I : I II : I : I I : : : I I I : I : : I : I : I 
Db 349 LAAIMSTISSQLLVTSSSLTEDFYKLIRGSDKASSHQKEFVLIGRLSVLLVAIVAITIA- 407 

Qy 398 LTKTVYGLWYLSSDLVYIV 1 FPQLLCVLFVKGTNTYGAVAGYVSGLFLRI 447 

I : : : : : I I : I I : I I I : : I I : I : I 

Db 408 WH PN DT I LN LVGNAWAG FGAAF S P LVL Y S L YWKD LT RAGAI S GMVAGAVWI 459 

Qy 448 TGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTL 507 

: : : I I :: I : I : : I : : : I : I : I I 
Db 460 VW ISWIKPLATINAFF GMYE IIPGFIVSVLITYIVSKL TK 499 

Qy 508 P PKLDVFDAWARH S EENMDKT I LVKNE 535 

I II: I I : : I II 

Db 500 KPD DYVI ENLNKVKHWKE 518 



RESULT 13 

US-09-328-352-6371 

; Sequence 6371, Application US/09328352 

; Patent No. 6562958 

; GENERAL INFORMATION: 

; APPLICANT: Gary L. Breton et al. 
; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO ACINETOBACTER 
; TITLE OF INVENTION: BAUMANNII FOR DIAGNOSTICS AND THERAPEUTICS 
; FILE REFERENCE: GTC99-03PA 
; CURRENT APPLICATION NUMBER: US/09/328,352 
; CURRENT FILING DATE: 1999-06-04 
; NUMBER OF SEQ ID NOS: 8252 
; SEQ ID NO 6371 
; LENGTH: 501 
; TYPE: PRT 
; ORGANISM: Acinetobacter baumannii 
US-09-328-352-6371 

Query Match 8.7%; Score 259; DB 4; Length 501; 

Best Local Similarity 23.1%; Pred. No. 1.4e-16; 

Matches 126; Conservative 97; Mismatches 206; Indels 116; Gaps 22; 

Qy 1 MAFHVEGLI AI I VFYLLI LLVGI WAAWRTKNSGSAEERS EAI I VGGRDI GLLVGGFTMTA 60 

I : : | | : | : : : : i : | : : | I : | : || I : I I : I 

Db 6 MSYFDPTLIMFMVYIVAMVLIGLFAYRATSDFSD YILGGRSLGSFVTALSAGA 58 

Qy 61 TWGGGYINGTAEAVTVPGYGLAWAQAPIGYSLSLILGGLFFAKPMR SKGYVTML 115 

: : I : I I : I : I . I I I I I I I : I : I : 

Db 59 S DMS GWLLMGL P GAI YL S GL S EAW — I AI GLI I GAWLNWLLVAGRLRVHT E I QNNALT L P 116 

Qy 116 DPFQQI YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SV IIDVDMHISVIISAL 170 

II : : I I : : : I : I I : I I : : : : : I I : 

Db 117 DYFT S RFDDQKKI LRI FS AVI I LVFF — AI YCAS GMVAGARLFENLFGMS YTTAI WLSAI 174 

Qy 171 IATLYTLVGGLYSVAYTDWQLFCIFVGLWISVPFAL S H P AVAD I G FT AVHAK Y 224 

I : I I : : : : I I I III III : : I : I 

Db 175 ATISYVCIGGFLAISWTDTFQ AGLMI FALLLTPIVTYLAIGDTTQFVTLIET 226 

Qy 225 QKPWLGTVDS SEVYSWLDS FLLLMLGGI PW-QAYF — QRVLSS S SAT YAQVLS FLAAFGC 281 

: I : I | : : : | : | | | : | : | : | | 

Db 227 ARPHAFNIIS DLSWAVLSSMAWGLGYFGQPHIL VRFMAADS- 2 68 

Qy 282 LVMAIPA 1 L I GAI GAST DWNQTAY GL P D P K TTEEADMILPIVLQY 326 

I : I I I 1:11:11 : II h :: : : : : 

Db 269 -VKSIPAARRIGMTWMILCLVGAVGAG — FFGIAYFQQHPELAGWSKNPETVFMELTKI 325 

Qy 327 LCPVYI S FFGLGAVSAAVMS SADS SI LSAS SMFARNI YQLS FRQNASDKEI WVMRITVF 386 

I : I I I : I I I I I : : I II : : I : I : I I I I I : I I I I I I 

Db 326 LFNPWIVGIILAAILAAWSTLSCQLLVCSSALTEDLYKGFIRKNASQKELVWVGRIMVL 385 

Qy 387 VFGASATAMALLTKTVYGLWYL S S DLVYI VI F PQLLCVLFVKGTNTYGAV 436 

I : I : I : I I : : : I : I :: I I I I I : 

Db 386 AI AVLAI VLAG- -N P D S KVLGLVAYAWAG FGAAFG P L 1 1 LS L FWKRMT LEGAL 436 

Qy 437 AGYVSGLFLRITGGEPYLYLQPLIF — YPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICI 4 94 

II : I : I I I: :: I I: : :|:: I : 

Db 437 AGMI VGAVWI — GW KN L FAS T GVYE 1 1 P G F ICAFISIIW 475 

Qy 495 SYLAK 499 

I ::| 

Db 476 SLISK 480 



RESULT 14 

US-09-252-991A-27829 

; Sequence 27829, Application US/09252991A 
; Patent No. 6551795 
; GENERAL INFORMATION: 
; APPLICANT: Marc J. Rubenfield et al. 
; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO PSEUDOMONAS 
; TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS 
; FILE REFERENCE: 107196.136 
; CURRENT APPLICATION NUMBER: US/09/252,991A 
; CURRENT FILING DATE: 1999-02-18 
; PRIOR APPLICATION NUMBER: US 60/074,788 
; PRIOR FILING DATE: 1998-02-18 
; PRIOR APPLICATION NUMBER: US 60/094,190 
; PRIOR FILING DATE: 1998-07-27 
; NUMBER OF SEQ ID NOS: 33142 
; SEQ ID NO 27829 
; LENGTH: 551 
; TYPE: PRT 
; ORGANISM: Pseudomonas aeruginosa 
US-09-252-991A-27829 

Query Match 8.7%; Score 257.5; DB 4; Length 551; 

Best Local Similarity 22.3%; Pred. No. 2.3e-16; 

Matches 102; Conservative 91; Mismatches 205; Indels 59; Gaps 16; 

Qy 6 EGLIAI I VFYLLI LLVGI WAAWRTKNSGSAEERSEAIIVGGR-DIGLLVGGFTM 58 

: | | : : : I I : : I : : I I | : : I : : : I I I I : I I I 

Db 2 8 KGARAMLLDYLIMLVYALAMLGLGWYGMR KAKSQSDFLVAGRRLGPGLYLG TM 80 

Qy 59 TATWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPF 118 

I : I I I I : I I I : I I : I : I I : : : I : 

Db 81 AAWL GGAS T I GT VKL G YQ FG L S GLWL VFMLG — L G 1 1 VL S LVF S RQ I ARL RVFT VT QVL 138 

Qy H9 QQIY GKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI IDVDMHI SVI I SALIATLY 175 

: | | : : | | : : : : : I I : I : I : : : : : 1=1 

Db 139 EQRYQAS S RLI GGWMVAY DLMVAVTATIAIGSVTEWFGIPRIPAILCGGGIVIVY 195 

Qy 176 TLVGGLYSVAYTDWQLFCI FVGLW- 1 SVP FALSHPA VAD I G FT AVHAK YQ K P 227 

: : : I I : : I : M : : I | | : : : : I : : I II 
Db 196 SVIGGMWSLTLTDIIQFVI KTVGIFLVLLPLSIDGAGGLARMQEVLPAGFFD 247 

Qy 22 8 WLGTVDS S EVYSWLDS FLLLMLGGI PWQAYFQRVLS S S S AT YAQVLS FLAAFGCLVMAI P 2 87 

||: : : : III I : I : I I I : : I I : I I : : 

Db 24 8 - LGH I GLDTI LT Y FLL YFFGALI GQDI WQRVFTARS ET WRYAGLGAGVYCMLYGAA 303 

Qy 288 AI LI GAI GASTDWNQTAYGLP DPKTTEEADMI LPI VLQYLCPVYI S FFGL- - GAVSAAVM 345 

I I I I III II 1:11 M I : I : I 

Db 304 CALI GAAARVL LPDLAVPENA— YAEITREVLAP GLRGLWAAALSAIM 350 

Qy 346 S S AD S S I L S AS SMFARN I YQL S FRQN AS DKE I VWVMRI T VFVFGAS AT AMALLT KT VYGL 405 

| : | : | : | : : : : I I I : : : I : I : : I I I 
Db 351 STASGCLLAAATVLQEDIYARFLRPGTTSD— IRLSRCITLLMGVAMLVLACLVNDVIAA 408 



Qy 406 WYLSSDLVYIVIFPQLLCVLFVKGTNTYGAVAGYVSG 442 



: : : I : : : : I : : I I : I hi 

Db 409 LSI AYN L L VGGL L VP I VGAL LW RRAS P Q GAI AS I VAG 445 

RESULT 15 

US-09-489-039A-7541 

; Sequence 7541, Application US/09489039A 
; Patent No. 6610836 
; GENERAL INFORMATION: 
; APPLICANT: Gary Breton et al. 
; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO KLEBSIELLA 
; TITLE OF INVENTION: PNEUMONIAE FOR DIAGNOSTICS AND THERAPEUTICS 
; FILE REFERENCE: 2709.2004001 
; CURRENT APPLICATION NUMBER: US/09/489,039A 
; CURRENT FILING DATE: 2000-01-27 
; PRIOR APPLICATION NUMBER: US 60/117,747 
; PRIOR FILING DATE: 1999-01-29 
; NUMBER OF SEQ ID NOS: 14342 
; SEQ ID NO 7541 
; LENGTH: 508 
; TYPE: PRT 
; ORGANISM: Klebsiella pneumoniae 
US-09-489-039A-7541 

Query Match 8.6%; Score 255; DB 4; Length 508; 

Best Local Similarity 24.8%; Pred. No. 3.6e-16; 

Matches 126; Conservative 85; Mismatches 219; Indels 78; Gaps 21; 

Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWR-TKNSGSAEERSEAIIVGGRDIGLLVGGFTMT 59 

|| | : | | : : : I : I : I I I I I I : I : I I I : I I • 

Db 7 MAISTPMLVTFIVYIFGMVLIG-FIAWRSTKN FDDYILGGRSLGPFVTALSAG 58 

Qy 60 ATWGGGYINGTAEAVTVPGYGIAWAQAPIGYSLSLILGGLFFAKPMR SKGYVTM 114 

| : : I : I I : :: I : I II : I : I : I : : I : 

Db 59 ASDMSGWLLMGLPGAIFLSGISESW— IAIGLTLGAWINWKLVAGRLRVHTEVNNNALTL 116 

Qy H5 LDPFQQIYGKRMGGLLFIPALMGEMFWAAAIFSALGATISV IIDVDMHISVIISA 169 

II : : | | | | : : I : I : I I : = = I 

Db 117 PDYFTGRFEDKSRVLRIISALVILLFF — TIYCASGIVAGARLFESTFGMSYETALWAGA 174 

Q y 170 L I AT L YT LVGGL Y S VAYT D WQL FC I FVGLW I S VP FAL S H P AVAD I G F- T AVHAK YQ 225 

r I I III : I : : I I I I : I : : I : I M : : I 

Db 175 AAT 1 1 YT FVGGFLAVSWTDTVQAS LMI FALI LT PVIVIISVGGFGDSLEVIKQ 227 

q v 226 KPWLGTVDSSEVYSWLDSFLLLMLGGIPW QAY FQRVL S S S SAT YAQVL S F 275 

| :::: : : I : : : I I I I I I I :|: :| 

Db 228 K S I ENI DMLKGLNFVAI I SLMG- - WGLGYFGQPHI LARFMAADSHHS IVHARRI SM 281 

Qy 276 LAAFGC LVMAI P AI L I GAI GAS T DWNQT AYGL P D P K TTEEADMILPIVLQYLCPVY 331 

|| | : : | I I I : I : I : : : I I • 

Db 282 TWMILCLG GAVAVGFFG 1 AY FNN N P S LAG AVN QN AE RV FIE LAQ I L FN P W 331 

Q y 332 I S FFGLGAVS AAVMS SAD S S I L S AS SMFARN I YQL S FRQNAS DKE I VWVMRI T VFVFGAS 391 

|s I |: Mill: :| I I "h hll I 1 = 1 I I I: I I 
Db 332 IAGILLSAILAAWSTLSCQLLVCSSAITEDLYKAFLRKNAGQKELVWVGRMMVLWALV 391 



Qy 392 ATAMA LLTKTVYGLWYLSSDLVYIVIFPQLLCVLFVKGTNTYGAVAGYVSGLFL 445 

I I : I : I I : : I : I I : : : I I I : I I I I 

Db 392 AIALAANPENRVLGLVSYAWAGFGAAFGPWLF SVMWSRMTRN-GALAGMVIGALT 44 6 

Qy 446 RITGGE-PYLYLQPLIFYPGYYPDDNGI 472 

I : : I I : I I I : II 
Db 447 VI VWKQFGWLGLYEI I — PGFVFGS I GI 472 



Search completed: March 22, 2004, 15:36:40 
Job time : 39 secs 



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 

OM protein - protein search, using sw model 

Run on: March 22, 2004, 15:19:38 ; Search time 40 Seconds 
(without alignments) 
1394.777 Million cell updates/sec 
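The throughput figure in the header can be reproduced from the other header values. The sketch below assumes, as an inference from the numbers rather than from GenCore documentation, that one cell update is one query-residue by database-residue comparison, so the total is query length times residues searched. The function name is illustrative.

```python
def million_cell_updates_per_sec(query_len: int, db_residues: int,
                                 seconds: float) -> float:
    """Dynamic-programming cells filled per second, in millions (assumed definition)."""
    cells = query_len * db_residues
    return cells / seconds / 1e6


# This report: 580-residue query, 96191526 residues searched, 40 seconds.
print(f"{million_cell_updates_per_sec(580, 96191526, 40):.3f}")    # 1394.777
# The first report in this file: 282547505 residues searched in 111 seconds.
print(f"{million_cell_updates_per_sec(580, 282547505, 111):.3f}")  # 1476.374
```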



Title: US-10-069-541-6 
Perfect score: 2972 
Sequence: 1 MAFHVEGLIAIIVFYLLILL..........EAFLDVDSSPEGSGTEDNLQ 580 

Scoring table: BLOSUM62 
Gapop 10.0 , Gapext 0.5 
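The header names a Smith-Waterman ("sw") model with a gap-open penalty of 10.0 and a gap-extension penalty of 0.5. As a rough illustration of what such a search computes for each database sequence, here is a minimal sketch of local alignment scoring with affine gaps (Gotoh's recurrences). Several things are assumptions: a gap of length k is charged gap_open + k * gap_extend (the exact GenCore convention is not stated in this report), the substitution function `toy_score` is a stand-in for BLOSUM62, and all names are hypothetical.

```python
def smith_waterman_affine(a: str, b: str, score,
                          gap_open: float = 10.0,
                          gap_extend: float = 0.5) -> float:
    """Best local alignment score of a and b with affine gap penalties."""
    neg_inf = float("-inf")
    h_prev = [0.0] * (len(b) + 1)   # best scores ending in the previous row
    f = [neg_inf] * (len(b) + 1)    # best scores ending in a gap along a, per column
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * (len(b) + 1)
        e = neg_inf                  # best score ending in a gap along b, this row
        for j in range(1, len(b) + 1):
            e = max(e - gap_extend, h_cur[j - 1] - gap_open - gap_extend)
            f[j] = max(f[j] - gap_extend, h_prev[j] - gap_open - gap_extend)
            diag = h_prev[j - 1] + score(a[i - 1], b[j - 1])
            h_cur[j] = max(0.0, diag, e, f[j])   # 0.0 restarts the local alignment
            best = max(best, h_cur[j])
        h_prev = h_cur
    return best


def toy_score(x: str, y: str) -> float:
    """Stand-in substitution score: +5 for identity, -4 otherwise (not BLOSUM62)."""
    return 5.0 if x == y else -4.0


# Ten identical residues with one extra residue inserted in the second sequence:
# ten matches (50) minus one gap of length 1 (10.0 + 0.5).
print(smith_waterman_affine("ACDEFGHIKL", "ACDEFWGHIKL", toy_score))  # 39.5
```

Half-point scores such as 277.5 in the hit list are consistent with an extension penalty of 0.5 being charged an odd number of times.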



Searched: 283366 seqs, 96191526 residues 

Total number of hits satisfying chosen parameters: 283366 

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 
Maximum Match 100% 
Listing first 45 summaries 

Database: PIR 78:* 
1: pir1:* 
2: pir2:* 
3: pir3:* 
4: pir4:* 


Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
ALIGNMENTS 



RESULT 1 
JC7502 

choline transporter - human 
C; Species: Homo sapiens (man) 

C;Date: 17-Nov-2000 #sequence_revision 17-Nov-2000 #text_change 01-Dec-2000 
C;Accession: JC7502 

R; Apparsundaram, S.; Ferguson, S.M.; George Jr., A.L.; Blakely, R.D. 
Biochem. Biophys. Res. Commun. 276, 862-867, 2000 

A; Title: Molecular cloning of a human, hemicholinium-3-sensitive choline transporter. 

A; Reference number: JC7502 
A; Contents: Spinal cord 
A;Accession: JC7502 
A;Molecule type: mRNA 
A; Residues: 1-580 <APP> 
A;Cross-references: GB:AF276871 

C; Comment: This protein, a hemicholinium-3-sensitive phosphorylated 
transmembrane protein, regulates high-affinity choline uptake, and plays the 
roles in disease states. 
C; Genetics: 
A; Gene: cht 
A; Map position: 2q12 

C; Keywords: choline transport; spinal cord; transmembrane protein; transport 
protein 

Query Match 100.0%; Score 2972; DB 2; Length 580; 

Best Local Similarity 100.0%; Pred. No. 9.5e-211; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 

Qy  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120 

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180 

Qy 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240 

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300 

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420 

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480 

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540 

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580 
       |||||||||||||||||||||||||||||||||||||||| 
Db 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580 
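RESULT 1 scores 2972, which equals the perfect score in the header, and its alignment has no mismatches or gaps. A plausible reading, stated here as an assumption, is that the perfect score is the query aligned against itself, that is, the sum of the BLOSUM62 diagonal entries over its 580 residues. The sketch below checks this. The diagonal values are the standard BLOSUM62 ones. The sequence is transcribed from the Qy rows of this listing, reading "W" as "VV" at the two places (within rows 181-240 and 481-540) where the row coordinates require one more residue than the extracted text shows.

```python
# Diagonal (self-substitution) entries of the standard BLOSUM62 matrix.
BLOSUM62_DIAGONAL = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9, "Q": 5, "E": 5, "G": 6, "H": 8, "I": 4,
    "L": 4, "K": 5, "M": 5, "F": 6, "P": 7, "S": 4, "T": 5, "W": 11, "Y": 7, "V": 4,
}

# Query US-10-069-541-6, transcribed from the alignment rows (see the note above).
QUERY = (
    "MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA"
    "TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ"
    "IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG"
    "LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW"
    "LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW"
    "NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA"
    "RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ"
    "LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK"
    "TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD"
    "ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ"
)


def self_score(seq: str) -> int:
    """Score of a sequence aligned to itself without gaps under BLOSUM62."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in seq)


print(len(QUERY), self_score(QUERY))  # 580 2972
```

That the length and the self-score both agree with the header supports the two "VV" readings, since a single "W" in either place would give a shorter sequence and a higher score.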



RESULT 2 
T20037 

hypothetical protein C48D1.3 - Caenorhabditis elegans 
C; Species: Caenorhabditis elegans 

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 15-Oct-1999 
C;Accession: T20037 
R; Burton, J. 
submitted to the EMBL Data Library, October 1996 
A; Reference number: Z19214 
A;Accession: T20037 

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A; Molecule type: DNA 
A; Residues: 1-631 <WIL> 

A; Cross-references: EMBL:Z81049; PIDN:CAB02847.1; GSPDB:GN00022; CESP:C48D1.3 

A; Experimental source: clone C48D1 

C; Genetics: 

A; Gene: CESP:C48D1.3 

A; Map position: 4 

A;Introns: 82/1; 120/3; 187/1; 236/3; 249/1; 358/1; 510/3; 570/2 

Query Match 45.8%; Score 1361.5; DB 2; Length 631; 

Best Local Similarity 46.8%; Pred. No. 2.5e-92; 

Matches 290; Conservative 91; Mismatches 146; Indels 93; Gaps 10; 

Qy 7 GLIAIIVFYLLILLVGIWAAWRTKNSGSAEER S EAI I VGGRD I GLLVGGFTMTATW 62 

I : : I I : I I : I I I : I I I I I : : I : I I : I : : : I I : I I I I I I I I I I I I 

Db 6 G I VAI VF F YVL I L WG I WAGRK SKSSKELES EAGAAT E EVMLAGRN I GT L VG I FTMT AT W 65 

Qy 63 VGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDP 117 

I I I I I I I I I I I : I II II I : I I : : I I :: I I I III II : I I : I I I I I 

Db 66 VGGAYINGTAEALY--NGGLLGCQAPVGYAISLVMGGLLFAKKMREEGYITMLDPFQFWN 123 

Qy 118 FQQI YGKRMGGLLFI PALMGEMFWAAAI F 146 

I I I I : I : I I I ::: I I I : I I I I I I I 
Db 124 FLELI FGRT FDNFRKLGRFLKLQTI I EI LDFFQHKYGQRI GGLMYVPALLGET FWTAAI L 183 

Qy 147 SALGATI SVI I DVDMHI SVI I SALIATLYTLVGGLYSVAYTDWQLFCI FVGLWI SVPFA 206 

I I II I I : I I I :: I I : I I : I I I I II I I I : I I II I II M I I I I I I I I : 
Db 184 SALGATLSVILGIDMNASVTLSACIAVFYTFTGGYYAVAYTDWQLFCIFVGLLILGLYV 243 

Qy 207 LSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSWLDSFLLLMLGGIPWQAYFQRVLSSSS 266 

: I I I : I I I I : I I I I I I I I I I I I I I : 

Db 244 QNRPN RFKETSLWIDCMLLLVFGGIPWQVYFQRVLSSKT 282 

Qy 267 AT YAQ VL S FLAAFG C L VMAI P AI L I GAI GAS T DWNQT A YG LPDPKTTE EA DMIL 320 

I I I I I I : I I I : : I I II I I I I I : I I I II : I I : : I : : 

Db 283 AHGAQT L S FVAGVGC I LMAI P PAL I GAI ARNT DWRMT D Y S PWNN GT KVE S I P P D KRNMW 342 

Qy 321 P I VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSAS SMFARNI YQLS FRQNAS DKEI VWV 380 

1:1 III I : : : I I I I I I I I I I I I I I I I I : I I I : I I I I I I : : I : I : I I : I I : : I 
Db 343 PLVFQYLT PRWVAFI GLGAVSAAVMSSADS S VLSAASMFAHNIWKLTI RPHAS EKEVI IV 402 

Qy 381 MRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQLLCVLFVKGTNTYGAVAGYV 440

III : I II Ml : : : I I I I II : I I I I : : : I I I I I I I : : : : I I I I : : I I I 
Db 403 MRIAIICVGIMATIMALTIQSIYGLWYLCADLVWILFPQLLCVA/YMPRSNTYGSLAGYA 462 

Qy 441 SGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYIAKY 500 

I I I I : I I I I : I III : I : I I I I : I I I : : I I : I : : 
Db 463 VGLVLRLIGGEPLVSLPAFFHYPMY TDGV— QYFPFRTTAMLSSMATIYIVSIQSEK 517 



Qy 501 LFESGTLPP KL D VFD AWARH S E ENMD KT I L VKN EN I KL D E LAL VK P RQ SMT L S S T FT N K 560 

Ihlllhll I I I I : I : I I : I I I 

Db 518 LFKSGRLS PEWDVMGCW NIPIDHVPLPSD-VSFAVSSETLNM 559 



Qy 561 EAFLDVDSSPEGSGTEDNLQ 580 

: I I : I I I I 

Db 560 KVECDGMQFPQ-LQTEHRLQ 578 



RESULT 3 
D75188 

proline symporter (proline permease) . PAB2354 - Pyrococcus abyssi (strain Orsay) 
C; Species: Pyrococcus abyssi 

C;Date: 20-Aug-1999 #sequence_revision 20-Aug-1999 #text_change 20-Jun-2000 
C; Accession: D75188 
R;anonymous, Genoscope

submitted to the EMBL Data Library, July 1999 

A; Description: Pyrococcus abyssi genome sequence: insights into archaeal 

chromosome structure and evolution. 

A; Reference number: A75001 

A;Accession: D75188

A; Status: preliminary 

A;Molecule type: DNA 

A; Residues: 1-492 <KAW> 

A;Cross-references: GB:AJ248283; GB:AL096836; NID:g5457433; PIDN:CAB48955.1;
PID:g5457464

A; Experimental source: strain Orsay 
C; Genetics : 

A; Gene: putP-3; PAB2354 

C;Superfamily: proline carrier protein 

Query Match 11.6%; Score 344; DB 2; Length 492; 

Best Local Similarity 24.2%; Pred. No. 1.1e-17;

Matches 132; Conservative 99; Mismatches 196; Indels 118; Gaps 25; 

Qy 8 LIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67 

I : I : : I : I I I : I III: I MM: : : : : 

Db 14 LVAFL FT L I L P I LVG F YAMKRT K S EEDFFVGGRAMDKI TVALS AVS S GRS SWL 66 

Qy 68 I N GTAEAVYVP G Y GLAWAQAP I G Y S L S LILGGLFFAKPMRSKGYVTMLDPFQQIYG 123 

: I : II I | : | I : : : I : I : | : I I : : 

Db 67 VL G L S GMA YKMG VTAVW — AAVGYIVAEMFQFVYMGIRLRKFSERFNAITVPDYFEARFR 124 

Qy 124 K RMGG LL FI PALMGEMFWAAAI FS ALGAT I S VI I DVDMH I SVI I S ALI ATL 174 

I : :: | : : : | Ml I : I : : : :: : I I I : : 

Db 125 DT S KI LRI AAS 1 1 1 1 1 FLT S YVGAQFNAGA KTLSTALGI S I FTALMI SVLMI IV 178 

Qy 175 YT LVGGL YS VAYT DWQL FC I FVGLWI S VP FAL S H PAVAD I GFT AVHAKYQK 226 

I :: I I : I I I I I :: : : I I : I I I I : I I I : I 

Db 179 YMILGGFIAVAYNDVIRAVIMIIGLW LPVIAVAKVGGTEEVLKVLHALDPKLIN 233 

Qy 227 PW L GT VD S S E VY SWLDSFLL LML G - G I P WQ AY - FQ RVL S S S SAT YAQ VL S F LAAFG C 281 

II II I : I I I I : 1:1 : I : : I 

Db 234 PWAFGAGWIG FLGIGFGSPGQPHIIVRYMSIDDPNKLRVSTWGTFWN 282 

Qy 282 L VMAI PAI L I GAI GAS TDWNQTAYGL PD PKTT - - EEADMI LP-I VLQ YLC P VYI S FFGLG 338

: I : I I I : I I : : I I : I : I I I : I I I : : I 

Db 283 WLAWGAI FVGLAGRAI VPDVSQLPGKNAEMIYPYLSAQYFPPILYGIL-IG 333 



Qy 339 AVS AAVMS SAD S S I L S AS SMFARN I YQ L S FRQN A — S DK E I VWVMRI T VFVFGAS AT AMA 396

: I I : : I : I I I : I : I : : : I I : : : I : : I : || I I I : I

Db 334 GI FAAI LSTADSQLLVVASTVVKDLYQEVI KKGTKI DEKTALTI SRVTVLVVGFLAAI LA 393



Qy 397 LLTKTVYGLWYLS S DLVY-I VI F PQLLCVLFVKGTNTYGAVAGYVSGLFL 445

I : : I : : : I : I I I : I : I I I : I : I I : I 

Db 394 YVAKDI I FWFVLFAWGGLGAS FGPTLI LS LYWKGTTKWGVLAGMI VGTIT 443 

Qy 446 RITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESG 505 

I I I I : I : I : I : I I : I : I : I : I 
Db 444 TIVW KLYLKPI TGLY-ELVP AFIFSLIATIIVSMITK 479 

Qy 506 TLPPK 510 

I I : 

Db 480 — PPE 482 



RESULT 4 
A37226 

glucose transport protein - rabbit 

N;Alternate names: sodium/D-glucose cotransporter 

C; Species: Oryctolagus cuniculus (domestic rabbit) 

C;Date: 30-Dec-1991 #sequence_revision 01-Mar-1996 #text_change 20-Aug-1999
C;Accession: S00515; S15974; A37226 

R;Hediger, M.A. ; Coady, M.J.; Ikeda, T.S.; Wright, E.M. 
Nature 330, 379-381, 1987 

A;Title: Expression cloning and cDNA sequencing of the Na/glucose co-transporter.

A;Reference number: S00515; MUID:88065856; PMID:2446136
A;Accession: S00515 
A; Molecule type: mRNA 
A; Residues: 1-662 <HED> 

A;Cross-references: EMBL:X06419; NID:g1640; PIDN:CAA29727.1; PID:g1641
R;Morrison, A.I.; Panayotova-Heiermann, M.; Feigl, G.; Schoelermann, B.; Kinne,
R.K.H.

Biochim. Biophys. Acta 1089, 121-123, 1991

A;Title: Sequence comparison of the sodium-D-glucose cotransport systems in 
rabbit renal and intestinal epithelia. 

A;Reference number: S15974; MUID:91223090; PMID:2025641
A;Accession: S15974 
A; Molecule type: mRNA 
A; Residues: 1-662 <MOR> 

A;Cross-references: EMBL:X55355; NID:g1716; PIDN:CAA39040.1; PID:g1717
R;Coady, M.J.; Pajor, A.M.; Wright, E.M. 
Am. J. Physiol. 259, C605-C610, 1990 

A;Title: Sequence homologies among intestinal and renal Na(+)/glucose
cotransporters.

A;Reference number: A37226; MUID:91023017; PMID:2221040

A;Accession: A37226 

A; Status: preliminary 

A; Molecule type: mRNA 

A; Residues: 178-662 <COA> 

A;Cross-references: GB:X06419

A; Experimental source: renal cortex 

C;Superfamily: proline carrier protein

Query Match 10.4%; Score 308.5; DB 2; Length 662; 

Best Local Similarity 23.4%; Pred. No. 6.6e-15; 

Matches 154; Conservative 110; Mismatches 238; Indels 155; Gaps 26; 



Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70 

I : : : : I : : : I I : I I : I I I : : I I : I : : I : : I I : I 

Db 32 I VI Y FLWMAVGLWAM F S T - N RGT V GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 86 

Qy 71 TAEAVYVPGYGLAWAQAP I GYS LSLILGGLFFAKPMRSKGYVTMLDPFQQIY-GK 124 

I III I I : : : : I | : I : I : I I I I : I : : I I 

Db 87 LA GT GAAS GI AT GGFEWNAL I MVWLGWVFVP I Y I RA- GWTMP E YLQKRFGGK 139 

Qy 125 RMGGLLFIPALMGEMFW — AAAI FSALGAT-ISVIIDVDMHISVI ISALIATLYTLVGGL 181 

I : I I : I : : I Mill II I : : : I : : : : : I I : I I I I : I I I 
Db 140 RIQIYLSILSLLLYIFTKI SADIFS — GAIFIQLTLGLDIYVAIIILLVITGLYTITGGL 197 

Qy 182 Y S VAYT DWQ L FC I FVGLW I S VP FAL S H P AVAD I G FT AVHAK Y Q 225 

: I I I I : I : I I I I I hill I 

Db 198 AAVI YT DT LQT AI MMVGS VI LT GFAFH EVG GYEAFTEKYMRAI PSQI S YGNTSI PQ 253 

Qy 226 KPWLGTVDSSEVYSWLDSFLLLMLGGIPW QAYFQRVLS S S SA 267 

I : I : : I : I I I I I I I I I : : 

Db 254 KCYTPREDAFHI FRDAITGDIPWPGLVFGMSILTLWYWCTDQVIVQRCLSAKNL 307 

Qy 268 T YAQVL S FLAAFGCLVMAI PAI L I GAI GAS T DWNQTAYGL P D P KTTEEADMILP 321 

• • • I • • • « » . | • • • i • I • • I 

• •• i • • • • • • i • ■ • | • | ■ > i 

Db 308 SHVKAGCILCGYLKVMPMFLIVMMGMVSRILYTDKVACWPSECERYCGTRVGCTNIAFP 367 

Qy 322 I VLQ YLCP VYI S FFGLGAVSAAVMS SADS S I LS AS SMFARNI YQL S FRQNAS DKEI VWVM 381 

: : II : I : h : I I I I I I h : I : I I I : I h I h : 
Db 368 TLVVELMPNGLRGLMLSVMI^SLMSSLTSIFNSASTLFTMDIY-TKIRKKASEKELMIAG 426

Qy 382 RI-TVFVFGASATAMALLTKTVYG — LWYLSSDLVYI — VI FPQLLCVLFVKGTNTYGAV 436 

I : : I : I I : : : I I : I I : I I : I I I I I 

Db 427 RLFMLFLIGISIAWVPIVQSAQSGQLFDYIQSITSYLGPPIAAVFLLAIFWKRVNEPGAF 486 

Qy 437 AGYVSGLFLRI TG GEPYLYLQPLI FYPGYYPDDNGI Y 473 

III : I II | | | | :: I 
Db 487 WGLVLGFLIGISRMITEFAYGTGSCMEPSNCPTIICGVHYLYFAIILF 534 

Qy 474 NQ KFP FKT LAMVT S FLTNI C I S YLAK YL FE S GT L P P KLD VFDAWA- RH S E ENMDKT I L V 532 

I I : I : : I I : I : : : : I : h I 
Db 535 VISIITWWSLFTKPI PDVHLYRLCWSLRNSKE 568 

Qy 533 KNENIKLD--ELALVKPRQSMTLSSTFTNKEAF LDVDSSPEGSGTED 577 

I I III : : : I : hi I I I h : h 

Db 569 --ERIDLDAGEEDIQEAPEEATDTEVPKKKKGFFRRAYDLFCGLDQDKGPKMTKEEE 623 



RESULT 5 
A33545 

Na+/glucose cotransporter SGLT1 - human 
C; Species: Homo sapiens (man) 

C;Date: 27-Feb-1990 #sequence_revision 27-Feb-1990 #text_change 20-Aug-1999 

C;Accession: A33545; A53804 

R;Hediger, M.A.; Turk, E.; Wright, E.M.

Proc. Natl. Acad. Sci. U.S.A. 86, 5748-5752, 1989 

A;Title: Homology of the human intestinal Na(+)/glucose and Escherichia coli
Na(+)/proline cotransporters.

A;Reference number: A33545; MUID:89345544; PMID:2490366



A;Accession: A33545 
A; Molecule type: mRNA 
A; Residues: 1-664 <HED> 

A;Cross-references: GB:M24847; NID:g338054; PIDN:AAA60320.1; PID:g338055
R;Turk, E.; Martin, M.G.; Wright, E.M.
J. Biol. Chem. 269, 15204-15209, 1994 

A;Title: Structure of the human Na+/glucose cotransporter gene SGLT1. 

A;Reference number: A53804; MUID:94253082; PMID:8195156

A; Accession: A53804 

A; Status : preliminary 

A; Molecule type: DNA 

A; Residues: 1-45 <TUR> 

A;Note: sequence extracted from NCBI backbone (NCBIN:147993, NCBIP:147994)
C; Genetics : 

A;Gene: GDB:SLC5A1; SGLT1

A;Cross-references: GDB: 120375; OMIM: 182380 

A;Map position: 22q13.1-22q13.1

C; Superfamily: proline carrier protein 

C; Keywords: transmembrane protein; transport protein 

Query Match 10.3%; Score 306; DB 2; Length 664; 

Best Local Similarity 22.8%; Pred. No. 1e-14;

Matches 148; Conservative 104; Mismatches 218; Indels 178; Gaps 30;

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70

|:::::::: I hi I : I I |: : II : I :: I: :| I: I 

Db 32 IVIYFVWMAVGLWAMFST-NRGTV GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 86 

Qy 71 TAEAVYVP G YGLAWAQAP I GYS LSLILGGLFFAKPMRSK-GYVTMLDPFQQI YGK 124 

| III ||: I : : I I I I I : I I I I I : I 

Db 87 LA GT GAAS G I AI GG FEWNAL VLVWL GWL FV — P I Y I KAGWTM PEYLRK 134

Qy 125 RMGG LLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALI A 172 

Ml 11:1 : = : I I I | :::::::::: : I 

Db 135 RFGGQRIQVYLSLLSLLLYIFTKISADIFSGAIF INLALGLNLYLAI FLLLAIT 188 

Qy 173 T L YT LVGGL YS VAYT D WQ LFC I FVGLWI S VP FAL S H P AVAD I G FT AVHAK YQ K — PWL - 229 

I I I : I I I : I I I I : I : I I I II hi I I I I : 

Db 189 AL YT I TGGLAAVI YTDTLQTVIMLVGS LI LTGFAFHEVG GYDAFMEKYMKAI PTIV 244 

Qy 230 GTVDS SEVYS -WLDS FLLL MLGGIPW QAYFQRVLSS 264 

I : | : | | | : : I : I I I I I I h 

Db 245 SDGNTTFQEKCYTPRADSFHIFRDPLTGDLPWPGFIFGMSILTLWYWCTDQVIVQRCLSA 304 

Qy 265 SSATYAQ VLS FLAAFGCLVMAI PAI L IGAI GASTDWNQT 303 

: : : : : : I : | : | : : I : I 

Db 305 KNMSHVKGGCILCGYLKLMPMFIMVMPGMISRILYTEKIACWPSECEKYCGTKVGCTNI 364 

Qy 304 AYGLPDPKTTEEADMI LP I VLQ YLCPVYI S FFGLGAVSAAVMS SADS S I LSAS SMFARN I 363 

M | :: | | : I : I : : I I I I |||::| :| 

Db 365 AY PT L WELMPN GLRGLML S VMLAS LMS S LT S I FN S AS T L FTMD I 409

Qy 364 YQLS FRQNASDKEIVWVMRITVFV- FGASATAMALLTKTVYG — LWYLS SDLVYI VI F 418 

I |: | |: | |:: h : I I I : : = I hi h I 

Db 410 Y-AKVRKRASEKELMIAGRLFILVLIGISIAWVPIVQSAQSGQLFDYIQSITSYLGPPIA 468 



Qy 419 PQLLCVLFVKGTNTYGAVAGYVSGLFLRI TG GEPYLY 455

I : I I I I I I : I I : I II I I I I

Db 469 AVFLLAIFWKRVNEPGAFWGLILGLLIGISRMITEFAYGTGSCMEPSNCPTIICGVHYLY 528

Qy 456 LQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFD 515 

: : I I I : I : I I I I : I : : : 

Db 529 FAIILF AISFITIWISLLTKPI PDVHLYR 558 

Qy 516 AV — VARH S E ENMD KT I LVKN EN I KLDELAL VK P RQ SMT L S S T FTN KE 561 

: I I : I : : II I : | : I : 

Db 559 LCWSLRNSKEERID — LDAEEENIQ EGPKETIEIETQVPEKK 598 



RESULT 6 
A53582 

Na+/glucose cotransporter SGLT1 - rat 

C; Species: Rattus norvegicus (Norway rat) 

C;Date: 12-Apr-1995 #sequence_revision 12-Apr-1995 #text_change 20-Aug-1999 
C;Accession: A53582 

R;Lee, W.S.; Kanai, Y.; Wells, R.G.; Hediger, M.A.
J. Biol. Chem. 269, 12032-12039, 1994 

A;Title: The high affinity Na(+)/glucose cotransporter. Re-evaluation of
function and distribution of expression. 

A;Reference number: A53582; MUID:94216314; PMID:8163506
A;Accession: A53582 
A; Status: preliminary 
A; Molecule type: mRNA 
A; Residues: 1-665 <LEE> 

A;Cross-references: GB:U03120; NID:g414571; PIDN:AAA19015.1; PID:g414572
C;Superfamily: proline carrier protein

Query Match 10.3%; Score 306; DB 2; Length 665;

Best Local Similarity 23.5%; Pred. No. 1e-14;

Matches 155; Conservative 105; Mismatches 242; Indels 158; Gaps 29; 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70 

I :::::::: I I : I I : I I I : : I I : I : : I : : I I : I 

Db 32 IVIYFWVMAVGLWAMFST-NRGTV GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 86 

Qy 71 TAEAVYVPG YGLAWAQAP I G YS L S LILGGLFFAKPMRSK-GYVTMLDPFQQIYGK 124 

I III II:: : : I I I I I : I I I I I : I 

Db 87 LA GT GAAAG I AMGG F EWN ALVFVWL GW L FV- - P I Y I KAGWTM PEYLRK 134 

Qy 125 RMGG LLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SAL I A 172 

III I I : I : : : I I I I : : : : I : : : : : I I 

Db 135 RFGGKRIQI YLS VLSLLLYI FTKI SADI FSGAI F INLALGLDIYLAIFILLAIT 188 

Qy 173 TLYTLVGGLYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQK — PWL- 229

III: Ml : I II I : I : I I : I M I - I Ml I I 

Db 189 ALYTITGGLAAVIYTDTLQTAIMLVGSFILTGFAFREVG GYEAFMDKYMKAI PTLV 244 

Qy 230 — GTVD- S S EVYS -WLDS FLLL MLGGIPW QAYFQRVLSS 264 

I : II: III: : I : I I I MM: 

Db 245 SDGNITVKEECYTPRADSFHIFRDPITGDMPWPGLIFGLSILALWYWCTDQVIVQRCLSA 304 

Qy 265 S SAT YAQVLS FLAAFGCLVMAI PAI LI GAI GASTDWNQTAYGLPDP KTTEEADM 318

: : : : I : I : : : I I : : I II : : 

Db 305 KNMSHVKAGCTLCGYLKLLPMFLMVMPGMISRILYTDKIACVLPSECKKYCGTPVGCTNI 364 



Qy 319 I LPIVLQYLCPVYI S FFGLGAVSAAVMS SADS S ILSAS SMFARNI YQLS FRQNASDKEI V 378 

I : : I I : I : I : : II I I I I I : : I : I I I : I I : I I : : 

Db 365 AYPTLWELMPNGLRGLMLSVMMASLMSSLTSIFNSASTLFTMDIY-TKIRKGASEKELM 423 

Qy 379 WVMRITVFV- FGASATAMALLTKTVYG — LWYLSSDLVYI — VTFPQLLCVLFVKGTNTY 433 

I : : I I I : : : I I : I I : I I : I I I 

Db 424 IAGRLFILVLIGISIAWVPIVQSAQSGQLFDYIQSITSYLGPPIAAVFLLAI FCKRVNEP 483

Qy 434 GAVAGYVSGLFLRI TG GEPYLYLQPLI FYPGYYPDDN 470 

I I I : I : I II I I I I : : I 
Db 484 GAFWGLILGFLIGISRMITEFAYGTGSCMEPSNCPKIICGVHYLYFAIILF 534 

Qy 471 GIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAV — VARHSEENMDK 528 

I : I : I I I I : I : : : I I : I 

Db 535 AISWTVLVISLLTKPI PDVHLYRLCWSLRNSTEERID- 572 

Qy 529 TILVKNENIKLDELALVKPRQSMTLSSTFTNKE AFLDVDSSPEGSGTED 577 

I I ::| |: :: : : || III h : |: 

Db 573 — LDAGEEEPVEE DPKDTIEIDAEAPQKEKGCFRKAYDLFCGLDQDKGPKMTKEEE 626 



RESULT 7 
A44432 

amino acid transport protein - pig 

N; Alternate names: Na+/amino acid cotransporter, SAAT1 
C; Species: Sus scrofa domestica (domestic pig) 

C;Date: 31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change 20-Aug-1999 
C;Accession: A44432 

R;Kong, C.T.; Yet, S.F.; Lever, J.E. 
J. Biol. Chem. 268, 1509-1512, 1993 

A; Title: Cloning and expression of a mammalian Na+/amino acid cotransporter with 

sequence similarity to Na+/glucose cotransporters.

A;Reference number: A44432; MUID:93131881; PMID:8420925

A; Accession: A44432 

A; Molecule type: nucleic acid 

A; Residues: 1-660 <KON> 

A;Cross-references: GB:L02900; NID:g164666; PIDN:AAC37325.1; PID:g164667

A; Experimental source: kidney epithelial cell line LLC-PK1 

A;Note: sequence extracted from NCBI backbone (NCBIP:122778)

C;Superfamily: proline carrier protein

C; Keywords: amino acid transport; membrane protein 

Query Match 10.2%; Score 303.5; DB 2; Length 660; 

Best Local Similarity 23.2%; Pred. No. 1.5e-14; 

Matches 141; Conservative 103; Mismatches 230; Indels 135; Gaps 26; 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70 

I :::::::: I I : I I MM: : I I I : I : : I : : I I : I 

Db 32 IVIYFVWMAVGLWAMLRT-NRGTV GGFFLAGRDVTWWPMGASLFASNIGSGHFVG 86 

Qy 71 TAEAVYVP G YG LA WAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQIY-GKRM 126 

I I : I I I I : : I I I I : I I : :: : I I I : 

Db 87 LAGTGAAS GI AI AAFEW NALLLLLVLGWFFVP I YI KAGVMTMPEYLRKRFGGKRL 141 

Qy 127 GGLLFIPAL MGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALIATLYTLVG 179 

I I : I : : : I I I I : : : I : : : : : I : I I : I 



Db 142 QIYLSILSLFICVALRISSDIFSGAIF IKIALGLDLYLAIFSLLAITAIYTITG 195 

Qy 180 GLYSVAYTDWQLFCI FVGLWI SVPFALSHPAVADI GFTAVHAKYQK — PWLGTVD 233 

I I I I I II : I : : | : | : | | | | : : II | : | 

Db 196 GLAS VI YTDTLQTI IMLI GS FI LMGFAF VEVGGYES FTEKYMNAI PTI VEGDNLTI 251 

Qy 234 SSEVYS-WLDSFLLL MLGGIPW QAYFQRVLS S S S AT YAQ 271 

I : I : Ml: : I I I I I I I I I : : : 

Db 252 SPKCYTPQGDSFHIFRDAVTGDIPWPGMI FGMTWAAWYWCTDQVIVQRCLSGKDMSHVK 311 

Qy 272 VLS FLAAFGCLVMAI P AI L I GAI GAS T DWNQTAYGL P D P KT TEE — ADMILPIVLQ 325 

: : I : : : I I : I : I II : : I : : : 

Db 312 AACIMCGYLKLLPMFLMVMPGMISRILYTEKVACWPSECVKHCGTEVGCSNYAYPLLVM 371 

Qy 326 YLCPVYI S FFGLGAVSAAVMS SADS S ILSAS SMFARN I YQLS FRQNASDKEI VWVMRI TV 385 

II : I : I : : I I I I I I I : : I : : I I : I I : I I : : I : : 

Db 372 ELMPSGLRGLMLSVMLASLMSSLTSIFNSASTLFTMDLY-TKIRKQASEKELLIAGRLFI 430 

Qy 386 FVFGASATAMALLTKTVYG LWYLS SDLVYI - -VI FPQLLCVLFVKGTNT YGA V 436 

: : I : I : I I : I I : I I I I I : 

Db 431 ILLIVISIVWVPLVQVAQNGQLFHYIESISSYLGPPIAAVFLLAI FCKRVNEQGAFWGLI 490 

Qy 437 AGYVSGL FLRITG GEPYLYLQPLI FYPGYYPDDNGI YNQKF 477 

hi II I : I I I I I I : : I : 
Db 491 IGFVMGLI RMI AEFVYGTGSCLAASNCPQI I CGVHYLYFALI LFF 535 

Qy 478 PFKTLAMVTSFLTNICISYLAK YLFE SGTLPPKLDVFDAWARH 521 

I I : I I I I : I : : :: I : I I II 

Db 536 VSILWLAISLLTKPIPDVHLYRLCWALRNSTEERIDL-DAEEKRHEEAHDG 586 

Qy 522 -SEENMDKT 529 

I : I : : I 

Db 587 VDEDNPEET 595 



RESULT 8 
E83468 

probable sodium/solute symporter PA1418 [imported] - Pseudomonas aeruginosa
(strain PAO1)

C; Species: Pseudomonas aeruginosa 

C;Date: 15-Sep-2000 #sequence_revision 15-Sep-2000 #text_change 31-Dec-2000 
C;Accession: E83468 

R;Stover, C.K.; Pham, X.Q.; Erwin, A.L.; Mizoguchi, S.D.; Warrener, P.; Hickey,
M.J.; Brinkman, F.S.L.; Hufnagle, W.O.; Kowalik, D.J.; Lagrou, M.; Garber, R.L.;
Goltry, L.; Tolentino, E.; Westbrook-Wadman, S.; Yuan, Y.; Brody, L.L.; Coulter,
S.N.; Folger, K.R.; Kas, A.; Larbig, K.; Lim, R.M.; Smith, K.A.; Spencer, D.H.;
Wong, G.K.S.; Wu, Z.; Paulsen, I.T.; Reizer, J.; Saier, M.H.; Hancock, R.E.W.;
Lory, S.; Olson, M.V.
Nature 406, 959-964, 2000 

A;Title: Complete genome sequence of Pseudomonas aeruginosa PAO1, an
opportunistic pathogen. 

A;Reference number: A82950; MUID:20437337; PMID:10984043
A;Accession: E83468
A; Status: preliminary 
A; Molecule type: DNA 
A; Residues: 1-463 <STO> 



A;Cross-references: GB:AE004571; GB:AE004091; NID:g9947360; PIDN:AAG04807.1;
GSPDB:GN00131; PASP:PA1418

A;Experimental source: strain PAO1

C; Genetics : 

A; Gene: PA1418 



Query Match 10.1%; Score 301; DB 2; Length 463; 

Best Local Similarity 25.1%; Pred. No. 1.5e-14; 

Matches 115; Conservative 86; Mismatches 211; Indels 46; Gaps 15; 



Qy 9 IAI I VFYLLI LLVGI WAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGF TMTAT 61

: 1 : : 1 : 1 1 1 : 1 1 1 : 1 : : 1 1 1 : : 1 II 1 1 1 1

Db 1 MALDI FWLI YAAGMIALGWYGMR RAKTRDD-YLVAGRNLG PGFYLGTMAAT 51

Qy 62 WVGGG YINGTAEAVYVPG YGLAWAQAP I GYS LS L I LGGL FFAKPMRS KG YVTMLD P FQQI 121

: 1 1 II III 1 III:: Mill: 1 : : :

Db 52 VLGGASTIGTVRLGYVHGISGFWLCGAIG — LGIVGLSLFLAKPLLKLKIYTVTQVLERR 109

Qy 122 YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI IDVDMHI S VI I SALIATLYTLVGGL 181

1 : 1 : : 1 1 : 1 : 1 : : : 1 : : 1 : 1 1 : : 1 1 :

Db 110 YN PAARHAS AL IMLVYALMI GAT ST IAI GTVMQVL FGLP FWVS I L I GGGVWLYST I GGM 169

Qy 182 YSVAYTDWQLFCI FVGL-WI SVPFALSHPAVADI GFTAVHAKYQKPWLGTVDS SEVYSW 240

: 1 : II : 1 1 : | M : : : | : : : | : | : | | : | : : |

Db 170 WSLTLTDIVQFLIMTVGLVFLLMPLSINDAG GWDAL VAKL PAS Y F DFTAI-GW 221

Qy 241 LDS FLLLMLGGI PWQAYFQRVLS S S SAT YAQVLS FLAAFGCLVMAI PAI LI GAI GAS 297

: II: 1 1 : 1 1 1 : : 1 1 1 : 1 1 1 : : : Ml

Db 222 DT I VT YFLIYFFGIFIGQDI WQ RVFT AR S ET VAKVAG S AAG I Y C VL YGMAGAL I GMAAKV 281

Qy 298 TDWNQTAYGLPD P KTTEEADMI LPI VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSAS S 357

III 1 : 1 ::: 1 1 : 1 1 1 : 1 1 : 1 : : 1 : 1 1 :

Db 282 L LPD LENWNAFASVVEHSLPNGI RGLVIAAALAALMSTASAGLLAAST 330

Qy 358 MFARNIY-QLSFRQNASDKEIWVMRITVFVFGASA^ 416

: : : : 1 : 1 1 1 II : 1 MM 1 : : :| :

Db 331 TVTQDLLPRLRRGRGQSDNGDVHENRIATLLLGLWLGIALVVSDVI SALTVAYNLLVGG 390

Qy 417 I FPQLLCVLFVKGTNTYGAVA GYVSGLFLRITGG 450

: 1 : :: 1 MM 1 ::: 1 II

Db 391 MLIPLIGAIYWKRATTAGAITSMTLGFLTVLVFMIKDG 428





RESULT 9 
B83988 

proline transporter opuE [imported] - Bacillus halodurans (strain C-125) 
C; Species: Bacillus halodurans 

C;Date: 01-Dec-2000 #sequence_revision 01-Dec-2000 #text_change 15-Jun-2001 
C; Accession: B83988 

R;Takami, H.; Nakasone, K.; Takaki, Y.; Maeno, G.; Sasaki, R.; Masui, N.; Fuji,
F.; Hirama, C.; Nakamura, Y.; Ogasawara, N.; Kuhara, S.; Horikoshi, K.
Nucleic Acids Res. 28, 4317-4331, 2000 

A;Title: Complete genome sequence of the alkaliphilic bacterium Bacillus 
halodurans and genomic sequence comparison with Bacillus subtilis. 
A;Reference number: A83650; MUID:20512582; PMID:11058132
A; Accession: B83988 
A; Status : preliminary 



A; Molecule type: DNA 
A; Residues : 1-507 <STO> 

A;Cross-references: GB:AP001516; GB:BA000004; NID:g10175192; PIDN:BAB06425.1;
GSPDB:GN00137 

A; Experimental source: strain C-125 
C; Genetics : 
A; Gene: opuE 

C;Superfamily: proline carrier protein 

Query Match 10.1%; Score 299.5; DB 2; Length 507; 

Best Local Similarity 26.2%; Pred. No. 2.2e-14; 

Matches 141; Conservative 84; Mismatches 220; Indels 93; Gaps 27; 

Qy 5 VEGL-IAI I VFYLL-ILLVGI WAAWRTKNS GSAEERS EAI I VGGRDI GLLVGGFTMTATW 62

| I I : I I : : I I : : I I : I : : : : • \ ' ' I I : : : : : 

Db 4 VEPLAVAILIAYLVALLLIGLLSS-KKSSVGMTD F F I AG RN LN KWT VAL S AVS S G 57 

Qy 63 VGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGY VTMLD 116 

: | II I I I I I : I M I : I : > : I 

Db 58 RS AWLVLG VTGTAYATGLDAVWAVA — GYI TVEVF — LFFYVARRFRAYS EQTGS I T I PD 113 

Qy 117 PFQQIYGKR MGGLLFI PALMGEMFWAAAI FSAL GATISVIIDVDMHISVIISA 169

: : : II II : I I I : I I I : : I • : : 1 

Db 114 ILETRFNDKTHIL RGG S AF I — I M — F FMI AYVAS Q LVAG GGAFAT SMGVS S S T GMWVT A 169

Qy 170 L I AT L YT LVGGL YS VAYT D WQ LFC I FVG LWI S VP FAL S H P AVAD I G FTAVHA 222 

: I | | : : I I : : I : I I I I I : I I I I I I I I I : I 

Db 170 VI LLAYTMLGGFHAVS KTDWQAGFMFVS LVI L PWAI I GLGGFDPLLQVMHT 222

Qy 223 KYQKPWLGTVDSSEVYSWLDSFLLLMLG-GIPWQAY-FQRVLSSSSATYAQVLSFLAAFG 280

: | | : : I I : I I I : I : I : : : : : : 

Db 223 EG GGFT S P FAFGFGAVI GLLGI GFGS PGN PHI LVRYMS LKNVKEMRQAALI S S VW 277 

Qy 281 CLVMAI PAI LI GAI GASTDWNQTAYGLPDPKTTEEAD MI LPI VLQYLCPVYI S FFGL 337

::| |:: | | I II II I : I : : I |:: I I 

Db 278 NVLMGWGAVMI GLAG RAY-FPDVSLLPNGDQEQVFLMLGSEILHPLFFGFL-L 328 

Qy 338 GAVSAAVMS SADS S I LSAS SMFARN I YQLS FRQN — AS DKEIVWVMRITVFVFGASATAM 395 

| | | | : | | | | | I : I Mlhlll I I : I I I : : I : I : I II : : 

Db 329 VAVLAAIMSSADSQLLVGSSAFVRDIYQKMFRRNRKLSQKKLVRLSRLTTWFMGLSLIL 388 

Qy 396 ALLTKTVYGLWYLSSDLVYIVIF PQLLCVLFVKGTNTYGAVAGYVSGLFL 445 

|| : | :|:| III : II | : I ::|| 

Db 389 A- FTAQEFVFW MVLFAFGGLGACFGPALLLSFYWKGVTRQGVLWGMIAGLLT 439

Qy 446 RITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLT NICISYLAK 499

| : I I 1:11 : I I I I I : -II I 

Db 440 VI LVKQQPQWTY-AFLPDVKELLNTYFFGITYEAVPGFIVATTITWISLFTK 491 

RESULT 10 
A42251 

nucleoside transport protein - rabbit 

N;Alternate names: Na+/nucleoside cotransporter, SNST1
C;Species: Oryctolagus cuniculus (domestic rabbit)

C;Date: 31-Dec-1993 #sequence_revision 31-Dec-1993 #text_change 20-Aug-1999 
C;Accession: A42251 



R;Pajor, A.M.; Wright, E.M. 

J. Biol. Chem. 267, 3557-3560, 1992 

A; Title: Cloning and functional expression of a mammalian Na+/nucleoside 

cotransporter. A member of the SGLT family.

A;Reference number: A42251; MUID:92156077; PMID:1740408

A;Accession: A42251

A;Molecule type: mRNA 

A; Residues: 1-672 <PAJ> 

A;Cross-references: GB:M84020; NID:g165550; PIDN:AAA31421.1; PID:g165551
A;Note: sequence extracted from NCBI backbone (NCBIN:82253, NCBIP:82256)
C;Superfamily: proline carrier protein
C;Keywords: membrane protein; nucleoside transport 

Query Match 10.0%; Score 298; DB 2; Length 672; 

Best Local Similarity 25.0%; Pred. No. 4e-14; 

Matches 153; Conservative 89; Mismatches 232; Indels 138; Gaps 25; 

Qy 9 IAII-VFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67

I I : I : : I I : : I I : I : I I I I : : I I : I : : I : : I I : 

Db 26 I AVI AAYFLLVI GVGLWSMCRT-NRGTV GGYFLAGRSMVWWPVGASLFASNIGSGH 80

Qy 68 INGTAEAVYVPG YGLAWAQAP I G YS L S LILGGLFFAKPMRSKGYVTMLDPFQQIYG 123 

II III II:: : : | | | | : I : I I I 
Db 81 FVGLA GT GAAN GLAVAG FEWNAL FWLL L GWL FAP VYLT AGVI TM PQYLR 130 

Qy 124 KRMGG LLFI PALMGEMFWAAAI F- - SALGATI SVI I DVDMHI SVI I SA 169 

MM I : I : : : I : I Mil : M I 

Db 131 KRFGGHRIRLYLSVLSLFLYIFTKISVDMFSGAVFIQQALGWNI YASVIALL 182 

Qy 170 L I AT L YT LVGGL YS VAYTDWQLFC I FVGLWI S VP FAL SH PAVAD I G FT AVHAKY 224 

I :||: III III II I I I : I : I I : : : II 

Db 183 GITMVYTVTGGLAALMYTDTVQTFVIIAGAFILTGYAFHEVG GYSGLFDKYMGAMT 238 

Qy 225 QKPWLGTVDSSEVYSWLDSFLLL MLGGIPW QAYF 258 

: I : I : I I I I : I I : I : I I I 

Db 239 SLTVSEDPAVGNISSSCYRPRPDSYHLLRDPVTGDLPWPALLLGLTIVSGWYWCSDQVIV 298 

Qy 259 QRVLS S SSATYAQVLS FLAAFGCLVMAI PAI LI GAI GASTDWNQTAYGLPDPKT TE 314 

II I: : |: : I : I :: I I :: I |: II 

Db 299 QRCLAGRNLTHIKAGCILCGYLKLTPMFLMVMPGMISRILYPDEVACVAPEVCKRVCGTE 358 

Qy 315 E- - ADMI L P I VLQ YLC P VY I S FFGLGAVS AAVMS SAD S S I L S AS SMFARN I YQ L S FRQNA 372 

:: : I :: I i : I : I I : I I I I M I : : I Ml I I I 
Db 359 VGCSNIAYPRLWKLMPNGLRGLMLAVMLAALMSSLASIFNSSSTLFTMDIYTL — RPRA 416 

Qy 373 SDKEIVWVMRITVFVFGASATAMALLTKTVYG LWYLSSDLVYIV — IFPQLLCVLFV 427 

: I : : I I : I I : I : : I I : I I : : : I I I 

Db 417 G E G E L L L VG RLWWF I VAVS VAW L P WQAAQGGQ LFDYIQSVSS YLAP P VS AVFWAL FV 476 

Qy 428 KGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMV — 485 

I I I I : I I : : I M I I : I 

Db 477 PRVNEKGAFWGLIGGLLMGLARLIP EFSFGTGSCVRP 513 

Qy 486 TSFLTNICISYLAKYLFE-SG T L P - P KL DVFDAWA- RH S EENMDKT I 530 

Ml: I I I I I I III:: I : I I I : I 
Db 514 SACPAFLCRVHYLYFAIVLFFCSGLLIIIVSLCTAPIPRKHLHRLVFSLRHSKE 567 



Qy 531 LVKNENIKLDEL 542 

: I : : I I I 
Db 568 — EREDLDADEL 577 



RESULT 11 
S59637 

glucose transport protein SGLT1, intestinal - sheep 
N;Alternate names: Na+/glucose cotransporter SGLT1 

C;Species: Ovis orientalis aries, Ovis ammon aries (domestic sheep) 

C;Date: 10-Apr-1996 #sequence_revision 19-Apr-1996 #text_change 20-Aug-1999 

C;Accession: S59637; S48858 

R;Tarpey, P.S.; Wood, I.S.; Shirazi-Beechey, S.P.; Beechey, R.B. 
Biochem. J. 312, 293-300, 1995 

A;Title: Amino acid sequence and the cellular location of the Na(+)-dependent
D-glucose symporters (SGLT1) in the ovine enterocyte and the parotid acinar cell.
A;Reference number: S59637; MUID:96077158; PMID:7492327
A;Accession: S59637 

A; Status: nucleic acid sequence not shown 
A; Molecule type: mRNA 
A; Residues: 1-664 <TAR> 

A;Cross-references: EMBL:X82411; NID:g861072; PIDN:CAA57809.1; PID:g861073
A;Experimental source: tissue type jejunal mucosa
R;Wood, I. 

submitted to the EMBL Data Library, October 1994 
A; Reference number: S48858 
A;Accession: S48858 
A; Molecule type: mRNA 

A;Residues: 1-233,'R',235-432,'V',434-466,'MR',469-664 <WOO>

A;Cross-references : EMBL:X82411 

C;Superfamily: proline carrier protein

Query Match 9.9%; Score 294; DB 2; Length 664; 

Best Local Similarity 23.9%; Pred. No. 7.7e-14; 

Matches 127; Conservative 93; Mismatches 202; Indels 110; Gaps 23; 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67 

I :::::::: I I : I I : I I I : : I I : I : : I : : I I : 

Db 32 I VI YFVVVMAVGLWAMFST - NRGTV GG F FLAG RSMVWW P I GAS L FAS N I G S GH FVG 86

Qy 68 INGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSK-GYVTMLDPFQQIYGKRM 126

: I I I : II I I I : I I : Mill : II 

Db 87 LAGTGAAAGIATGGFEWN AL I L WLL GWVFV — P I Y I KAGWTM PEYLRKRF 136 

Qy 127 GG LLFI PALMGEMFWAAAI FSALGATI S VI I DVDMHI S VI I SALIATL 174 

II : I : I : : : I I I I : : : : I : : : : : I II 

Db 137 GGQRIQVYLSVLSLVLYI FTKI SADI FSGAI F INLALGLDLYLAIFILLAITAL 190 

Qy 175 YT L VG G L Y S VAYT DWQ L F C I FVG L W I S VP FAL S H P AVAD I G FT AVHAK YQ K P W L GT VD S 234

I I : I I I : I I I I : I : : I : I II I : : I II : I I I 

Db 191 YTITGGLAAVIYTDTLQTVIMLLGSFILTGFAFHEVG GYSAFVTKYMNA-IPTVTS 245 

Qy 235 SEVYS-WLDSFLLL MLGGIPW QAYFQRVLSSS 265

I I : III: : I : I I I MM: 

Db 246 YGNTTVKKECYTPRADSFHIFRDPLKGDLPWPGLIFGLTIISLWYWCTDQVIVQRCLSAK 305 



Qy 266 SATYAQVLSFLAAFGCLVMAI PAILIGAIGASTDWNQTAYGLPDPKTTEE AD 317

Db 306 NMSHVKAGCIMCGYMKLLPMFLMVMPGMISRILFTEKVACTV--PSECEKYCGTKVGCTN 363



Qy 318 MI L P I VLQ YLC PVY I S FFGLGAVS AAVMS S ADS S I L S AS SMFARN I YQL S FRQN AS DKE I 377 

: I : : II : I : I : : I I I I I I I : : I : I I I : I I : I I : 
Db 364 IAYPTLWELMPNGLRGLMLSVMLASLMSSLTSIFNSASTLFTMDIY-TKIRKKASEKEL 422 

Qy 378 VWVMRI TVFV- FGASATAMALLTKTVYG — LWYLS S DLVYI — VI FPQLLCVLFVKGTNT 432 

: I : : I I I : : : I I : I I : I I : I I I 

Db 423 MIAGRLEKLVLIGVSIAWVPIVQSAQSGQLFDYIQSITSYLGPPIAAVFLLAI FCKRVNE 482 

Qy 433 YGAVAGYVSGLFLRI TG GEPYLYLQPLIF 461

Db 483 PGAFWGLIIGFLIGVSRMITEFAYGTGSCMEPSNCPTIICGVHYLYFAIILF 534



RESULT 12 
A56765 

sodium-glucose cotransporter homolog - human 
C; Species: Homo sapiens (man) 

C;Date: 08-Sep-1995 #sequence_revision 08-Sep-1995 #text_change 20-Aug-1999 
C;Accession: A56765; 151890 

R;Wells, R.G.; Pajor, A.M.; Kanai, Y.; Turk, E.; Wright, E.M.; Hediger, M.A.
Am. J. Physiol. 263, F459-F465, 1992 

A;Title: Cloning of a human kidney cDNA with similarity to the sodium-glucose
cotransporter.

A;Reference number: A56765; MUID:93035768; PMID:1415574
A;Accession: A56765
A;Status: preliminary
A; Molecule type: mRNA 
A; Residues: 1-672 <WEL> 

A;Cross-references: GB:M95549; NID:g338052; PIDN:AAA36608.1; PID:g338053
A;Experimental source: kidney cortex
C;Superfamily: proline carrier protein
C; Keywords: transmembrane protein 

Query Match 9.8%; Score 292; DB 2; Length 672; 

Best Local Similarity 24.1%; Pred. No. 1.1e-13;

Matches 147; Conservative 91; Mismatches 237; Indels 136; Gaps 22; 

Qy 8 LIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67

: : I : : | I : : I | : I : III!: : I I : I : : I : : I I : 

Db 26 ILVIAAYFLLVIGVGLWSMCRT-NRGTV GGYFLAGRSMVWWPVGASLFASNIGSGH 80 

Qy 68 I NGTAEAVYVP G YGLAWAQAP I G YS L S L I LGGL FFAKPMRS KG YVTMLD P FQQ I YG 123 

II Mill:: : : I I I I : I : I I I 
Db 81 FVGLA GTGAASGLAVAGFEWNALFWLLLGWLFAPVYLTAGVITM PQYLR 130 

Qy 124 KRMGG LLFI PALMGEMFWAAAI F — SALGAT I SVI I DVDMHI SVI I SA 169 

II II I : I : : : I : I I I I I MM 
Db 131 KRFGGRRIRLYLSVLSLFLYIFTKISVDMFSGAVFIQQALGWNI YASVIALL 182 

Qy 170 L I AT L YT LVGGL Y S VAYT DWQL FC I FVGLWI S VP FAL S H PAVAD I GFTAVHAKY 224 

I :||: III III II I I I I : : I I : : : II 

Db 183 GITMIYTVTGGLAALMYTDTVQTFVILGGACILMGYAFHEVG GYSGLFDKYLGAAT 238 



Qy 225 QKPWLGTVDS S EVYSWLDS FLLL MLGGIPW- QAYF 258

: I : I : I I I : I I : I : I I I

Db 239 SLTVSEDPAVGNISSFCYRPRPDSYHLLRHPVTGDLPWPALLLGLTIVSGWYWCSDQVIV 298

Qy 259 QRVLS S S SAT YAQVL S FLAAFGCLVMAI PAI LIGAI GASTDWNQTAYGLPDPKT TE 314 

111:11:: i : I :: I I : : | : | : II 

Db 299 QRCLAGKSLTHIKAGCILCGYLKLTPMFLMVMPGMISRILYPDEVACWPEVCRRVCGTE 358 

Qy 315 E — ADMI LP I VLQ YLCPVYI S FFGLGAVSAAVMS SADS S I LS AS SMFARN I YQLS FRQNA 372 

: : : I : : I I : I : I I : I I I I I : I : : I : I I II 

Db 359 VGCSNIAYPRLVVKLMPNGLRGLMIAVMLAALMSSLASIFNSSSTLFTMDIY-TRLRPRA 417 

Qy 373 S DKE I VWVMRI - TVFVFGASATAMALLTKTVYGLW YLS S DLVYI VI FPQLLCV LFV 427 

I : I : : I I : I I : I : : : I : I : I : I III 

Db 418 GDRELLLVGRLWWFIVWSVAWLPWQAAQGGQLFDYIQAVSSYLAPPVSAVFVLALFV 477 

Qy 428 KGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMV-- 485 

I I I I : I I : : I : I I : : I 

Db 478 PRVNEQGAFWGLIGGLLMGLARLIP EFSFGSGSCVQP 514 

Qy 486 TSFLTNICISYLAKYLFE-SGTLPPKLDVFDAW ARHSEENMDKTI 530 

: I I : I I II II I : : I : I I I : I 

Db 515 SACPAFLCGVHYLYFAIVLFFCSGLLTLTVSLCTAPIPRKHLHRLVFSLRHSKE 568 

Qy 531 LVKNENIKLDE 541 

: I : : II 
Db 569 --EREDLDADE 577



RESULT 13 
S59638 

glucose transport protein SGLT1, parotid gland - sheep
N;Alternate names: Na+/glucose cotransporter SGLT1

C;Species: Ovis orientalis aries, Ovis ammon aries (domestic sheep)

C;Date: 19-Mar-1997 #sequence_revision 19-Mar-1997 #text_change 07-May-1999 

C;Accession: S59638; S48857 

R;Tarpey, P.S.; Wood, I.S.; Shirazi-Beechey, S.P.; Beechey, R.B. 
Biochem. J. 312, 293-300, 1995 

A;Title: Amino acid sequence and the cellular location of the Na(+)-dependent
D-glucose symporters (SGLT1) in the ovine enterocyte and the parotid acinar cell.
A;Reference number: S59637; MUID:96077158; PMID:7492327
A;Accession: S59638

A;Status: nucleic acid sequence not shown; translation not shown
A;Molecule type: mRNA
A;Residues: 1-664 <TAR>
A;Cross-references: EMBL:X82410

A;Experimental source: clone SGLTB; tissue type parotid gland

A;Note: the nucleotide sequence was submitted to the EMBL Data Library, October
1994

C;Superfamily: proline carrier protein
C;Keywords: transmembrane protein

Query Match 9.7%; Score 288; DB 2; Length 664; 

Best Local Similarity 23.7%; Pred. No. 2.1e-13; 

Matches 126; Conservative 93; Mismatches 203; Indels 110; Gaps 23; 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67 

| :::::::: I I : I : I I I : : I I : | : : I : : I I : 



Db 32 I VI Y FVWMAVG LWHMF S T - N RGT V GGFFLAGRSMVWWP I GAS LFASNI GS GH FVG 86

Qy 68 INGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSK-GYVTMLDPFQQIYGKRM 126

: | | I : || | : : | | : | I : I I I I I : II

Db 87 LAGTGAAAG I AT GG FEWN AL I LWLLGWVFV -- P I YI KAGWTM PEYLRKRF 136

Qy 127 GG LLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALI ATL 174

|| :|:| : :: III |:: : :|::::: I II

Db 137 GGQRIQVYLSVLSLVLYI FTKI SAD I FSGAI F INLALGLDLYLAIFILLAITAL 190

Qy 175 YT L VG G L Y S VAYT DWQ L FC I FVG LW I S VP FAL S H P AVAD I G FT AVHAK YQ K P W L GT VD S 234

I I : I I I : I I I I : I : : I : I I I I : = I I I • N I

Db 191 YTITGGLAAVIYTDTLQTVIMLLGSFILTGFAFHEVG GYSAFVTKYMNA- 1 PTVTS 245

Qy 235 SEVYS-WLDSFLLL MLGGIPW QAYFQRVLSSS 265

||: Ml : : I : I I I 11.11s 

Db 246 YGNTT VKKEC YT P RAD S FH I FRD P LKGDL PW P GL I FGLT 1 1 S LWYWCT DQVI VQRC L S AK 305 



Qy 266 SAT YAQVL S F LAAFGC LVMAI PAI L I GAI GAS T DWNQTAYGL P D P KT T E E AD 317 

: : : : : : I : : : I I : I : I I : : 

Db 306 NMSHVKAGCIMCGYMKLLPMFLMVMPGMISRILFTEKVACTV— PSECEKYCGTKVGCTN 363 

Qy 318 MI LP I VLQYLC PVYI S FFGLGAVS AAVMS S ADS S I L S AS SMFARNI YQLS FRQNAS DKE I 377

: | : : | | : I : I : : I I I I I I I s : I : I I I : I I : I I = 

Db 364 IAYPTLWELMPNGLRGLMLSVMLASLMSSLTSIFNSASTLFTMDIY-TKIRKKASEKEL 422 

Qy 378 VWVMRITVFV-FGASATAMALLTKTVYG — LWYLSSDLVYI — VIFPQLLCVLFVKGTNT 432 

: |: : I I I : :: I hi" I : I I :| I I 

Db 423 MIAGRLFMLVLIGVSIAWVPIVQSAQSGQLFDYIQSITSYLGPPIAAVFLLAI FCKRVNE 482 

Qy 433 YGAVAGYVSGLFLRI TG GEPYLYLQPLI F 461 

II I : I : : II I I I I : : I 

Db 483 PGAFWGLIIGFLIGVSRMITEFAYGTGSCMEPSNCPTIICGVHYLYFAIILF 534 



RESULT 14 
H71097 

hypothetical protein PH1044 - Pyrococcus horikoshii 
C; Species: Pyrococcus horikoshii 

C;Date: 14-Aug-1998 #sequence_revision 14-Aug-1998 #text_change 20-Jun-2000
C;Accession: H71097 

R;Kawarabayasi, Y.; Sawada, M.; Horikawa, H.; Haikawa, Y.; Hino, Y.; Yamamoto,
S.; Sekine, M.; Baba, S.; Kosugi, H.; Hosoyama, A.; Nagai, Y.; Sakai, M.; Ogura,
K.; Otsuka, R.; Nakazawa, H.; Takamiya, M.; Ohfuku, Y.; Funahashi, T.; Tanaka,
T.; Kudoh, Y.; Yamazaki, J.; Kushida, N.; Oguchi, A.; Aoki, K.; Yoshizawa, T.;
Nakamura, Y.; Robb, F.T.; Horikoshi, K.; Masuchi, Y.; Shizuya, H.; Kikuchi, H.
DNA Res. 5, 55-76, 1998

A;Title: Complete sequence and gene organization of the genome of a
hyper-thermophilic archaebacterium, Pyrococcus horikoshii OT3.
A;Reference number: A71000; MUID:98344137; PMID:9679194
A;Accession: H71097

A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-491 <KAW>

A;Cross-references: GB:AP000004; NID:g3236131; PIDN:BAA30142.1; PID:g3257459
A;Experimental source: strain OT3

A;Note: this accession replaces an interim accession for a sequence replaced by
GenBank

C;Genetics:
A;Gene: PH1044

C;Superfamily: proline carrier protein

Query Match 9.6%; Score 286; DB 2; Length 491; 

Best Local Similarity 22.9%; Pred. No. 2.1e-13; 

Matches 125; Conservative 83; Mismatches 197; Indels 142; Gaps 20; 

Qy 8 LIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67 

II :: : I |:||: I : I : I :| II :||: I 

Db 17 LIIVLGWVFLSLIVGVMAGIKRKFT LEGYLVSGRTLGLI FLYVLMAGEI YSAYA 70 

Qy 68 I N GT AEAVYVP G Y GLAWAQ AP I G Y SLSLILGGLF FA KPMRSKGYVTMLDPFQQIYG 123 

II I I : : I III I I : I I :: I I I I I I I 

Db 71 FLGTGGWAYSYGMPIMYA 1 GYGALAYS FGYFYARYVWKAGKAFGC VTQADYFQVRYN 127 

Qy 124 KRMGGLLFI PALMGEMF WAAAI FSALGAT I SV — II DVDMHI S VI I SALIATLYTLV 178 

: I : I I : I : I : I I : I : : : : I : I : I 

Db 128 SK — ALAVLVALI GI I FNI P YLQLQLQGLGYI VHVGSLGS ITPKAGI VI GMI IMMI YVYT 185 

Qy 179 GGLYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVY 238 

II : : : I : : : I : I : I : I I : II II 
Db 186 SGLRGISWTNLLQATLMFIVAWV-VLFTIPFKQFGGIGELFKTLAQTKP 233 

Qy 239 SWLDS FLLLMLGGI PWQAYFQRVLS S S SAT YAQVLS FLAAFGCLV- -MAI PAI LI GAI GA 296 

I : I II I I I I : I : hi 
Db 234 DHLILHPPLGISW YVSTL-ILSGLGFFMYPQLYPSI 268 

Qy 297 STDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVS 341 

I I III: : : I I : . : : I I : : I : I : 
Db 269 YGARDLKTLKRNYVLLPLYS I FMI PVI LAGFTVAALGI KLSAPDEAVLKAVE 320 

Qy 342 AAVMS S ADS S I L S AS SMFARN I YQLS FRQNAS DKE I VWVMRI TV 385 

II I : I : : I I : : : : I : I : : : I I II I : I I I I : I 
Db 321 I T YP SWVLG WGAAGFAAAASTASAI LLS LAGLLS KNLYAI A- KPTAS DKELVLVS RI S V 379 

Qy 386 FVFGAS ATAMALLT KTVYGLWYLS S DLVYI VI FPQLLCVLFVKGTNTY 433 

: I I : I I I II::: I I : I I I I I 

Db 380 ILLGLLAMGLAL YS P GRLVS L L L LAYAGMT QMFP GAVFG L FWKRMN K YAT G 430 

Qy 434 ~ GAVAG YVS GLFLRI TGGEP YLYLQPLI F YP GY YPDDNGI YNQKFP FKT LAMVT S FLTNI 492 

I : I I : : : I I : I II I I : I : : : : : 

Db 431 TGIIAGLITVAYLRLV LKKNPL GIH FGLWGLLVNI IVTL 469 

Qy 493 CISYLAK 499 

:: I I I 

Db 470 IVAYLTK 476 



RESULT 15 
H69670 

sodium/proline symporter opuE - Bacillus subtilis 
N; Alternate names: proline transporter opuE 
C; Species: Bacillus subtilis 

C;Date: 05-Dec-1997 #sequence_revision 05-Dec-1997 #text_change 20-Jun-2000 



C;Accession: H69670; T44450 

R;Kunst, F.; Ogasawara, N.; Moszer, I.; Albertini, A.M.; Alloni, G.; Azevedo,
V.; Bertero, M.G.; Bessieres, P.; Bolotin, A.; Borchert, S.; Boriss, R.;
Boursier, L.; Brans, A.; Braun, M.; Brignell, S.C.; Bron, S.; Brouillet, S.;
Bruschi, C.V.; Caldwell, B.; Capuano, V.; Carter, N.M.; Choi, S.K.; Codani,
J.J.; Connerton, I.F.; Cummings, N.J.; Daniel, R.A.; Denizot, F.; Devine, K.M.;
Duesterhoeft, A.; Ehrlich, S.D.; Emmerson, P.T.; Entian, K.D.; Errington, J.;
Fabret, C.; Ferrari, E.
Nature 390, 249-256, 1997

A;Authors: Foulger, D.; Fritz, C.; Fujita, M.; Fujita, Y.; Fuma, S.; Galizzi,
A.; Galleron, N.; Ghim, S.Y.; Glaser, P.; Goffeau, A.; Golightly, E.J.; Grandi,
G.; Guiseppi, G.; Guy, B.J.; Haga, K.; Haiech, J.; Harwood, C.R.; Henaut, A.;
Hilbert, H.; Holsappel, S.; Hosono, S.; Hullo, M.F.; Itaya, M.; Jones, L.;
Joris, B.; Karamata, D.; Kasahara, Y.; Klaerr-Blanchard, M.; Klein, C.;
Kobayashi, Y.; Koetter, P.; Koningstein, G.; Krogh, S.; Kumano, M.; Kurita, K.;
Lapidus, A.; Lardinois, S.

A;Authors: Lauber, J.; Lazarevic, V.; Lee, S.M.; Levine, A.; Liu, H.; Masuda,
S.; Maueel, C.; Medigue, C.; Medina, N.; Mellado, R.P.; Mizuno, M.; Moestl, D.;
Nakai, S.; Noback, M.; Noone, D.; O'Reilly, M.; Ogawa, K.; Ogiwara, A.; Oudega,
B.; Park, S.H.; Parro, V.; Pohl, T.M.; Portetelle, D.; Porwolik, S.; Prescott,
A.M.; Presecan, E.; Pujic, P.; Purnelle, B.; Rapoport, G.; Rey, M.; Reynolds,
S.; Rieger, M.; Rivolta, C.; Rocha, E.; Roche, B.; Rose, M.; Sadaie, Y.; Sato,
T.; Scanlon, E.

A;Authors: Schleich, S.; Schroeter, R.; Scoffone, F.; Sekiguchi, J.; Sekowska,
A.; Seror, S.J.; Serror, P.; Shin, B.S.; Soldo, B.; Sorokin, A.; Tacconi, E.;
Takagi, T.; Takahashi, H.; Takemaru, K.; Takeuchi, M.; Tamakoshi, A.; Tanaka,
T.; Terpstra, P.; Tognoni, A.; Tosato, V.; Uchiyama, S.; Vandenbol, M.; Vannier,
F.; Vassarotti, A.; Viari, A.; Wambutt, R.; Wedler, E.; Wedler, H.;
Weitzenegger, T.; Winters, P.; Wipat, A.; Yamamoto, H.; Yamane, K.; Yasumoto,
K.; Yata, K.; Yoshida, K.

A;Authors: Yoshikawa, H.F.; Zumstein, E.; Yoshikawa, H.; Danchin, A.

A;Title: The complete genome sequence of the Gram-positive bacterium Bacillus
subtilis.

A;Reference number: A69580; MUID:98044033; PMID:9384377
A;Accession: H69670

A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-492 <KUN>

A;Cross-references: GB:Z99107; GB:AL009126; NID:g2632866; PIDN:CAB12486.1;
PID:g2632980

A;Experimental source: strain 168
R;Borriss, R.

submitted to the EMBL Data Library, June 1997

A;Reference number: Z22776
A;Accession: T44450
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-492 <BOR>
A;Cross-references: EMBL:AF011545; PIDN:AAB72182.1
C;Genetics:
A;Gene: opuE
A;Map position: 56 degree
C;Function:
A;Description: catalyzes the uptake of proline by a Na+-dependent transport
mechanism

C;Superfamily: proline carrier protein

C;Keywords: proline transport; sodium transport; symport system; transmembrane
protein

F;44-71/Domain: transmembrane #status predicted <TM1>
F;126-145/Domain: transmembrane #status predicted <TM2>
F;161-183/Domain: transmembrane #status predicted <TM3>
F;189-208/Domain: transmembrane #status predicted <TM4>
F;231-253/Domain: transmembrane #status predicted <TM5>
F;272-295/Domain: transmembrane #status predicted <TM6>
F;311-347/Domain: transmembrane #status predicted <TM7>
F;366-386/Domain: transmembrane #status predicted <TM8>
F;391-417/Domain: transmembrane #status predicted <TM9>
F;422-438/Domain: transmembrane #status predicted <TM10>
F;452-470/Domain: transmembrane #status predicted <TM11>



Query Match 9.6%; Score 285; DB 2; Length 492; 

Best Local Similarity 22.1%; Pred. No. 2.5e-13; 

Matches 118; Conservative 97; Mismatches 214; Indels 106; Gaps 18;



Qy 5 VEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVG 64

: | : | ::::::: I I : I : I : I : : : I I I : I I : I :

Db 3 I EI 1 1 SLGI YFIAMLLI GWYAFKKTTDIND YMLGGRGLGP FVTALS AGAADMS 55

Qy 65 GGYINGTAEAVYVPGYGLAWAQAPI .GYSLSLILGGLFFAKPMRSKGYVTMLDPFQQI 121

I : I I : : I I : II hi I : : I : I I :

Db 56 GWMLMGVP GAMFAT GL ST LWLAL G LT I GAY SN YLL LAP RL RAYT EAAD DAI TIPDFFDKR 115

Qy 122 YGKRMGGLLFI PALMGEMFWAAAI FSAL GAT I S VI I DVDMH I S VI I SAL I AT L YT LV 178

: | : | | : : | : I : I | : : : : : | | | |

Db 116 FQHSSSLLKIVSALIIMIFFTLYTSSGMVSGGRLFESAFGADYKLGLFLTTAVWLYTLF 175

Qy 179 GGL Y S VAYT D WQL FC I FVGLW I S VP FAL S H P AVAD I G FT AVHAK YQK PWLGT VD S S EVY 238

I I : I : I I I I : I I : I I : I I I I : I : •

Db 176 G G FLAVS LT D FVQGAI M FAAL - VLVP I VAFT -- H VGGVAP T FH E I D AVN P H 223

Qy 239 SWLDSF L L LML GG I P WQ AY FQ RVL S S S SAT YAQ VLSFLA 277

III :::::| : II : : I : I

Db 224 - LLDI FKGASVI S 1 1 S YLAWGLGY YGQPHIIVRFMAIKDIKDLKPARRIG 272

Qy 278 AFGCLVMAI PAI LI GAI GASTDWNQTAYGLPDPKTTEEADMI LPI VLQYLCPVYI S FFGL 337

: : : : : I I I I II : : : | I I : I I : I I

Db 273 MSWMI ITVLGSVLTGLIG VAYAHKFGVAVKDPEMIFIIFSKILFHPLITGFLL 325

Qy 338 GAVS AAVMS SAD S S I L S AS SMFARN I YQL S FRQN AS DKE I VWVMRI TVFVFGAS ATAMAL 397

I : I I : I I I I : I : I : : I : I I : I I I I I : I : I : : I I I : : I

Db 326 SAI LAAIMS S I S SQLLVTAS AVT EDLYRS FFRRKAS DKELVMI GRLSVLVI AVI AVLLS L 385

Qy 398 LTKTVYGLWYLSSDLVYIVIF PQL LCVL FVKGTNT YGAVAG YVS G LF 444

: I :: : I : I : I I : I I : I I : I : I :

Db 386 NP NSTILDLVGYAWAGFGSAFGPAILLSLYWKRMNEWGALAAMIVGAATVL 436

Qy 445 LRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKK 499

: || | I : I : I : | : | : I : I

Db 437 IWITTG LAKSTGVY-EIIP GFILSMIAGIIVSMITK 471



Search completed: March 22, 2004, 15:35:51 
Job time : 44 secs



GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on: March 22, 2004, 15:32:29 ; Search time 531 Seconds

(without alignments) 
282.851 Million cell updates/sec 



Title: US-10-069-541-6
Perfect score: 2972
Sequence: 1 MAFHVEGLIAIIVFYLLILL EAFLDVDSSPEGSGTEDNLQ 580

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 1049977 seqs, 258955339 residues

Total number of hits satisfying chosen parameters: 1049977

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries
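
The throughput figure in the run header can be cross-checked against the other header values. The sketch below assumes (this relation is not stated in the report) that one "cell update" is one query-residue by database-residue comparison, so the total work is the query length times the number of database residues. The function name is hypothetical.

```python
# Values copied from the run header of this search.
QUERY_LENGTH = 580            # residues in the query US-10-069-541-6
DB_RESIDUES = 258_955_339     # "Searched: 1049977 seqs, 258955339 residues"
RATE_MCUPS = 282.851          # "Million cell updates/sec"


def implied_search_seconds(query_len: int, db_residues: int, rate_mcups: float) -> float:
    """Assumed relation: cells = query_len * db_residues; seconds = cells / rate."""
    cells = query_len * db_residues
    return cells / (rate_mcups * 1e6)


seconds = implied_search_seconds(QUERY_LENGTH, DB_RESIDUES, RATE_MCUPS)
print(round(seconds))
```

Under that assumption the header values imply a search time of roughly 531 seconds.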



Database : Published_Applications_AA:*

1: /cgn2_6/ptodata/1/pubpaa/US07_PUBCOMB.pep:*
2: /cgn2_6/ptodata/1/pubpaa/PCT_NEW_PUB.pep:*
3: /cgn2_6/ptodata/1/pubpaa/US06_NEW_PUB.pep:*
4: /cgn2_6/ptodata/1/pubpaa/US06_PUBCOMB.pep:*
5: /cgn2_6/ptodata/1/pubpaa/US07_NEW_PUB.pep:*
6: /cgn2_6/ptodata/1/pubpaa/PCTUS_PUBCOMB.pep:*
7: /cgn2_6/ptodata/1/pubpaa/US08_NEW_PUB.pep:*
8: /cgn2_6/ptodata/1/pubpaa/US08_PUBCOMB.pep:*
9: /cgn2_6/ptodata/1/pubpaa/US09A_PUBCOMB.pep:*
10: /cgn2_6/ptodata/1/pubpaa/US09B_PUBCOMB.pep:*
11: /cgn2_6/ptodata/1/pubpaa/US09C_PUBCOMB.pep:*
12: /cgn2_6/ptodata/1/pubpaa/US09_NEW_PUB.pep:*
13: /cgn2_6/ptodata/1/pubpaa/US10A_PUBCOMB.pep:*
14: /cgn2_6/ptodata/1/pubpaa/US10B_PUBCOMB.pep:*
15: /cgn2_6/ptodata/1/pubpaa/US10C_PUBCOMB.pep:*
16: /cgn2_6/ptodata/1/pubpaa/US10_NEW_PUB.pep:*
17: /cgn2_6/ptodata/1/pubpaa/US60_NEW_PUB.pep:*
18: /cgn2_6/ptodata/1/pubpaa/US60_PUBCOMB.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES 

                  %
Result          Query
   No.   Score  Match Length DB ID                     Description

     1    2972  100.0    580 10 US-09-911-077A-2       Sequence 2, Appli
     2    2972  100.0    580 10 US-09-911-077A-10      Sequence 10, Appl
     3    2972  100.0    580 10 US-09-911-077A-11      Sequence 11, Appl
     4    2972  100.0    580 10 US-09-911-077A-12      Sequence 12, Appl
     5    2820   94.9    580 10 US-09-911-077A-6       Sequence 6, Appli
     6    2795   94.0    580 10 US-09-911-077A-4       Sequence 4, Appli
     7    2795   94.0    580 10 US-09-911-077A-24      Sequence 24, Appl
     8  1506.5   50.7    610 12 US-10-241-784-2        Sequence 2, Appli
     9    1453   48.9    576 10 US-09-911-077A-8       Sequence 8, Appli
    10   311.5   10.5    675  9 US-09-733-630-2        Sequence 2, Appli
    11     306   10.3    486 14 US-10-156-761-12818    Sequence 12818, A
    12     306   10.3    664 14 US-10-119-988-12       Sequence 12, Appl
    13   298.5   10.0    675  9 US-09-928-530-2        Sequence 2, Appli
    14   298.5   10.0    675 14 US-10-162-012-27       Sequence 27, Appl
    15   298.5   10.0    675 15 US-10-162-102-27       Sequence 27, Appl
    16   297.5   10.0    471 12 US-10-282-122A-52725   Sequence 52725, A
    17     295    9.9    596 14 US-10-119-988-8        Sequence 8, Appli
    18     292    9.8    672  9 US-09-928-530-5        Sequence 5, Appli
    19     292    9.8    672 14 US-10-162-012-30       Sequence 30, Appl
    20     292    9.8    672 15 US-10-162-102-30       Sequence 30, Appl
    21     287    9.7    678 12 US-10-072-012-438      Sequence 438, App
    22   281.5    9.5    673 12 US-10-072-012-440      Sequence 440, App
    23     279    9.4    454 12 US-10-282-122A-53545   Sequence 53545, A
    24   277.5    9.3    596  9 US-09-740-026A-2       Sequence 2, Appli
    25   277.5    9.3    596 12 US-10-072-012-114      Sequence 114, App
    26   277.5    9.3    596 12 US-10-169-395-124      Sequence 124, App
    27   277.5    9.3    596 14 US-10-237-859-2        Sequence 2, Appli
    28   277.5    9.3    643 14 US-10-119-988-5        Sequence 5, Appli
    29     277    9.3    596  9 US-09-740-026A-4       Sequence 4, Appli
    30     277    9.3    596 14 US-10-237-859-4        Sequence 4, Appli
    31     277    9.3    597 12 US-10-072-012-436      Sequence 436, App
    32   272.5    9.2    524  9 US-09-738-626-6949     Sequence 6949, Ap
    33   272.5    9.2    524 12 US-10-627-476-496      Sequence 496, App
    34   269.5    9.1    612 12 US-10-072-012-116      Sequence 116, App
    35   269.5    9.1    664 14 US-10-119-988-2        Sequence 2, Appli
    36   262.5    8.8    477 12 US-10-282-122A-67179   Sequence 67179, A
    37     262    8.8    512 15 US-10-161-493-36       Sequence 36, Appl
    38     260    8.7    552 12 US-10-072-012-439      Sequence 439, App
    39     260    8.7    674 14 US-10-173-123-9        Sequence 9, Appli
    40     260    8.7    681 14 US-10-173-123-7        Sequence 7, Appli
    41     260    8.7    752 15 US-10-297-022-20       Sequence 20, Appl
    42     254    8.5    561 12 US-10-072-012-118      Sequence 118, App
    43     249    8.4    479 12 US-10-282-122A-46830   Sequence 46830, A
    44     248    8.3    742 15 US-10-297-022-15       Sequence 15, Appl
    45     245    8.2    738 14 US-10-173-123-13       Sequence 13, Appl
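
The "% Query Match" column of the summaries appears to be the hit score expressed as a percentage of the perfect score (2972 for this query), rounded to one decimal place. This relation is an inference from the listed values, not a statement made by the report itself. A minimal sketch that reproduces the column under that assumption (the function name is hypothetical):

```python
PERFECT_SCORE = 2972  # "Perfect score" from the run header


def query_match_percent(score: float, perfect_score: float = PERFECT_SCORE) -> float:
    """Assumed relation: Query Match % = 100 * score / perfect score, to one decimal."""
    return round(100.0 * score / perfect_score, 1)


# (result number, score) pairs taken from the summaries of this search.
for result_no, score in [(1, 2972), (5, 2820), (8, 1506.5), (10, 311.5), (45, 245)]:
    print(result_no, score, query_match_percent(score))
```

The same check can be applied to any row to catch a score or percentage that was damaged in transcription.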



ALIGNMENTS 



RESULT 1 

US-09-911-077A-2 

; Sequence 2, Application US/09911077A
; Publication No. US20030114399A1
; GENERAL INFORMATION:
; APPLICANT: BLAKELY, RANDY D.
; APPLICANT: APPARSUNDARAM, SUBRAMANIAM
; APPLICANT: FERGUSON, SHAWN
; TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA
; FILE REFERENCE: VBLT:008US
; CURRENT APPLICATION NUMBER: US/09/911,077A
; CURRENT FILING DATE: 2001-07-23
; NUMBER OF SEQ ID NOS: 27
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 2
; LENGTH: 580
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-911-077A-2

Query Match 100.0%; Score 2972; DB 10; Length 580; 

Best Local Similarity 100.0%; Pred. No. 2.6e-269; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 MAFH VE G L I AI I VF Y L L I L L VG I WAAW RT KN S G S AE ERS EAI I VGG RD I GL LVG G FTMT A 60 

| | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 

Qy 61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120 

| I I I I I I M I I I I I I I I I I I I I I I I I I I M I M I I I I I I I I I I I I I I I I I I M M M I I I 

Db 61 TWVGGGYINGTAEAVYVPGYGLAWAQAP I GYS LS LI LGGLFFAKPMRS KGYVTMLDP FQQ 120

Qy 121 I YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALI ATLYTLVGG 180 

| | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 121 I YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALI ATLYTLVGG 180 

Qy 181 L YS VAYT DWQ L FC I FVGLWI S VP FAL S H P AVAD I G FT AVHAK YQ K PWL GT VD S S EVY S W 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I M I I I I 

Db 181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy 241 LDS FLLLMLGGI PWQAYFQRVLS S S SAT YAQVLS FLAAFGCLVMAI PAI LI GAI GASTDW 300 

MM I I II I I I I I I I I I I I I I I I i I I I I I I I I I I I II 

Db 241 LDS FLLLMLGGI PWQAYFQRVLS SS SAT YAQVLS FLAAFGCLVMAI PAI LI GAI GASTDW 300 

Qy 301 NQTAYGL P D P KTT EEADMI L P I VLQ YLC P VY I S FFGLGAVS AAVMS SAD S S I L S AS SMFA 360 

| | II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

| II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420 

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480 

I M II I II I II I I I I M I I I I II M I I M N I I I 

Db 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480 

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540 

| || M I I II II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 481 T LAMVT S FLTN I C I S YLAK Y LFESGTLPP K LDVFDAWARH S E ENMD KT I LVKN EN I KLD 540 

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580

|| | | I I I I M I I II I I I I II II I II I I M I II I I I I I I I I 
Db 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580 



RESULT 2 

US-09-911-077A-10 

; Sequence 10, Application US/09911077A
; Publication No. US20030114399A1
; GENERAL INFORMATION:
; APPLICANT: BLAKELY, RANDY D.
; APPLICANT: APPARSUNDARAM, SUBRAMANIAM
; APPLICANT: FERGUSON, SHAWN
; TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA
; FILE REFERENCE: VBLT:008US
; CURRENT APPLICATION NUMBER: US/09/911,077A
; CURRENT FILING DATE: 2001-07-23
; NUMBER OF SEQ ID NOS: 27
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 10
; LENGTH: 580
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-911-077A-10

Query Match 100.0%; Score 2972; DB 10; Length 580; 

Best Local Similarity 100.0%; Pred. No. 2.6e-269; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MAFHVEGLIAI IVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

| I I I I I I I I I I I I I I | I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I M

Db 1 MAFHVEGLIAI IVFYLLILLVGIWAAWRTKNSGSAEERSEAI I VGGRDIGLLVGGFTMTA 60

Qy 61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I M I I I I I I I

Db 61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy 121 I YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALIATLYTLVGG 180

| | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 121 IYGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALIATLYTLVGG 180

Qy 181 L Y S VAYT D WQL FC I FVG LW I S VP FAL S H PAVAD I G FT AVHAK YQK P WL GT VD S S EVY S W 240

| | | | | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 181 L YS VAYT D WQ LFC I FVGLW I S VP FAL S H PAVAD I GFT AVHAKYQK PWLGT VD S S EVYSW 240

Qy 241 L D S F L L LML G G I P WQ AY FQ RVL S S S SAT Y AQ VL S F LAAFG C L VMAI P AI L I GAI GAS T DW 300

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 241 LDS FLLLMLGGI PWQAYFQRVLS S S SATYAQVLS FLAAFGCLVMAI PAI LI GAI GASTDW 300

Qy 301 NQTAYGLPDPKTTEEADMI LPI VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSASSMFA 360

| | | | II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I >

Db 301 NQTAYGLPDPKTTEEADMI LPI VLQYLCPVYI S FFGLGAVSAAVMS SAD S S I LSAS SMFA 360

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

M I I I II I I I II I I I I I I I II II II I I I I I I I I I I M II I I II IN I MM

Db 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLI FYPGYYPDDNGI YNQKFPFK 480

I I I I I I I I I I I I I I I I I I | I I I I I M I I I I I I M I I M I I I I I II I I I I M I I I !

Db 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLI FYPGYYPDDNGI YNQKFPFK 480



Qy 481 T LAMVT S FLTN I C I S YLAKYLFES GT L P PKL DVFDAWARHS EENMDKT I L VKNEN I KLD 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580 



RESULT 3 

US-09-911-077A-11 

; Sequence 11, Application US/09911077A
; Publication No. US20030114399A1
; GENERAL INFORMATION:
; APPLICANT: BLAKELY, RANDY D.
; APPLICANT: APPARSUNDARAM, SUBRAMANIAM
; APPLICANT: FERGUSON, SHAWN
; TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA
; FILE REFERENCE: VBLT:008US
; CURRENT APPLICATION NUMBER: US/09/911,077A
; CURRENT FILING DATE: 2001-07-23
; NUMBER OF SEQ ID NOS: 27
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 11
; LENGTH: 580
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-911-077A-11

Query Match 100.0%; Score 2972; DB 10; Length 580; 

Best Local Similarity 100.0%; Pred. No. 2.6e-269; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 

Qy 61 TWVGGG YI N GT AEAVYVP G YGLAWAQAP I G YS L S L I LGGL F FAKPMRS KG YVTML D P FQQ 120 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 

Db 61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy 121 I YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI IDVDMHI SVI I SALIATLYTLVGG 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 121 I YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI IDVDMHI SVI I SALIATLYTLVGG 180 

Qy 181 L Y S VAYT DWQ L FC I FVG LW I S VP FAL S H P AVAD I G FT AVHAK YQ K P W L GT VD S S E VY S W 240 

I I 1 1 1 1 I 1 1 1 1 1 i 1 1 I I 1 1 1 1 1 1 I 1 1 1 1 1 1 1 I I 1 1 1 1 I I 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 I I 1 1 

Db 181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240 

Qy 241 L D S FL L LMLGG I P WQAY FQ RVL S S S SAT YAQ VL S FLAAFGC L VMAI PAI L I GAI GAS TDW 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 241 LDS FLLLMLGGI PWQAYFQRVLS S S SAT YAQ VLS FLAAFGC LVMAI PAI LI GAI GASTDW 300 

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 301 NQTAYGLPDPKTTEEADMI LPI VLQYLCPVYI S FFGLGAVSAAVMS SADS S ILSAS SMFA 360 



Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I

Db 361 RN I YQ L S FRQN AS D K E I VWVMR I T VFVFGAS AT AMAL LT KT VYG LW YL S S D L VY I VI F P Q 420

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I N I I I I I I I I

Db 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy 481 T LAMVT S FLTN I C I S YLAK YL FE S GT L P P KL DVFDAWARH S E ENMD KT I LVKN EN I KLD 540

| | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I M I I I I I I

Db 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILVKNENIKLD 540

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580

| I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580



RESULT 4 

US-09-911-077A-12 

; Sequence 12, Application US/09911077A
; Publication No. US20030114399A1
; GENERAL INFORMATION:
; APPLICANT: BLAKELY, RANDY D.
; APPLICANT: APPARSUNDARAM, SUBRAMANIAM
; APPLICANT: FERGUSON, SHAWN
; TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA
; FILE REFERENCE: VBLT:008US
; CURRENT APPLICATION NUMBER: US/09/911,077A
; CURRENT FILING DATE: 2001-07-23
; NUMBER OF SEQ ID NOS: 27
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 12
; LENGTH: 580
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-911-077A-12

Query Match 100.0%; Score 2972; DB 10; Length 580; 

Best Local Similarity 100.0%; Pred. No. 2.6e-269; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

| | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1 MAFH VEG L I AI I VF Y L L I L L VG I W AAW RT KN S G S AE E R S EAI I VG G RD I GL L VGG FTMT A 60

Qy 61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

| | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M
Db 61 TWVGGGY I NGT AEAVYVP GYGLAWAQAP I G YS LS L I L GGL F FAK PMRS KG YVTMLD P FQQ 120

Qy 121 I YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI IDVDMHI SVI I SALIATLYTLVGG 180

| | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I M I I I II I I II I I I I II
Db 121 I YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVI IDVDMHI SVI I SALIATLYTLVGG 180

Qy 181 L Y S VAYT DWQ L FC I FVG LW I S VP F AL S H P AVAD I G FT AVHAK Y Q K P W L GT VD S S E VY S W 240

| | | | | I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I
Db 181 LYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy 241 LDS FLLLMLGGI PWQAYFQRVLS S S SAT YAQVLS FLAAFGCLVMAI PAI LI GAI GASTDW 300

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 III 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 U

Db 241 LDS FLLLMLGGI PWQAYFQRVLS S S SAT YAQVLS FLAAFGCLVMAI PAI LI GAI GASTDW 300

Qy 301 NQTAYGL P DP KTT EEADMI L P I VLQYLC PVYI S FFGLGAVS AAVMS S ADS S I LS AS SMFA 360

I | I I 1 I I I I I I | | I I I I I I I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Db 301 NQTAYGL PDP KTT EEADMI LP I VLQYLCPVYI S FFGLGAVS AAVMS SADS S I LSAS SMFA 360

Qy 361 RN I YQLS FRQNAS DKE I VWVMRI TVFVFGASATAMALLTKTVYGLW YLS S DLVYI VI FPQ 420

I I I I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1

Db 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Db 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

I I I I I I I 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Db 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Db 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580





RESULT 5 

US-09-911-077A-6 

; Sequence 6, Application US/09911077A 

; Publication No. US20030114399A1 

; GENERAL INFORMATION: 

; APPLICANT: BLAKELY, RANDY D. 

; APPLICANT: APPARSUNDARAM, SUBRAMANIAM 

; APPLICANT: FERGUSON, SHAWN 

; TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA 
; FILE REFERENCE: VBLT:008US 

; CURRENT APPLICATION NUMBER: US/09/911,077A

; CURRENT FILING DATE: 2001-07-23

; NUMBER OF SEQ ID NOS: 27

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 6 

; LENGTH: 580 

; TYPE: PRT 

; ORGANISM: Rattus norvegicus 
US-09-911-077A-6 

Query Match 94.9%; Score 2820; DB 10; Length 580; 

Best Local Similarity 93.1%; Pred. No. 4.5e-255; 

Matches 540; Conservative 24; Mismatches 16; Indels 0; Gaps 0; 

Qy          1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
              | ||||||:|||:||||| ||||||||:|||||:||||||||||||||||||||||||||
Db          1 MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy         61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
              |||||||||||||||| || ||||||||||||||||||||||||||||||||||||||||
Db         61 TWVGGGYINGTAEAVYGPGCGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
              ||||||||||||||||||||||||||||||||||||||||::||||:||||| |||||||
Db        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180

Qy        181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
              ||||||||||||||||:||||||||||||||| |||||||||||| |||||::| |||:|
Db        181 LYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240

Qy        241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
              ||:||||||||||||||||||||||||||||||||||||||||||:||| ||||||||||
Db        241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300

Qy        301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
              |||||| ||||| |||||||||||||||||||||||||||||||||||||||||||||||
Db        301 NQTAYGFPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy        361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||:||||
Db        361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420

Qy        421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
              ||||||:||||||||||||: ||||||||||||||||||||||||||| ||||||:||||
Db        421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYPDKNGIYNQRFPFK 480

Qy        481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
              ||:||||| ||||:||||||||||||||||||:|||||:|||||||||||||:||||||:
Db        481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDIFDAVVSRHSEENMDKTILVRNENIKLN 540

Qy        541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
              ||| ||||||:||||||||||| |||||||||||||||||
Db        541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580
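
The summary figures printed for each hit can be cross-checked with a few lines of arithmetic. The sketch below rests on assumptions inferred from the numbers in this listing, not on any documented formula: Query Match is taken to be Score divided by the Perfect score (2972, from the report header), and Best Local Similarity is taken to be Matches divided by the total number of aligned columns (Matches + Conservative + Mismatches + Indels). The `Hit` class and its method names are hypothetical.

```python
from dataclasses import dataclass

# "Perfect score" as printed in the report header.
PERFECT_SCORE = 2972


@dataclass
class Hit:
    """Summary counts copied from one RESULT block (hypothetical container)."""
    score: float
    matches: int
    conservative: int
    mismatches: int
    indels: int

    def query_match(self) -> float:
        # Assumption: percentage of the perfect (self-alignment) score.
        return 100.0 * self.score / PERFECT_SCORE

    def best_local_similarity(self) -> float:
        # Assumption: identical positions over all aligned columns.
        columns = (self.matches + self.conservative
                   + self.mismatches + self.indels)
        return 100.0 * self.matches / columns


# Counts taken from RESULT 5 (US-09-911-077A-6).
result_5 = Hit(score=2820, matches=540, conservative=24, mismatches=16, indels=0)
print(f"{result_5.query_match():.1f}")            # 94.9
print(f"{result_5.best_local_similarity():.1f}")  # 93.1
```

Under the same assumptions, the counts listed for RESULT 8 (Score 1506.5; 293 matches, 84 conservative, 120 mismatches, 19 indels) give 50.7 and 56.8, which agree with the values printed in that block.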



RESULT 6 

US-09-911-077A-4 

; Sequence 4, Application US/09911077A 

; Publication No. US20030114399A1 

; GENERAL INFORMATION: 

; APPLICANT: BLAKELY, RANDY D. 

; APPLICANT: APPARSUNDARAM, SUBRAMANIAM

; APPLICANT: FERGUSON, SHAWN 

; TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA 
; FILE REFERENCE: VBLT:008US 

; CURRENT APPLICATION NUMBER: US/09/911,077A

; CURRENT FILING DATE: 2001-07-23

; NUMBER OF SEQ ID NOS: 27

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 4

; LENGTH: 580

; TYPE: PRT
; ORGANISM: Mus musculus
US-09-911-077A-4 

Query Match 94.0%; Score 2795; DB 10; Length 580; 

Best Local Similarity 92.6%; Pred. No. 9.8e-253; 

Matches 537; Conservative 23; Mismatches 20; Indels 0; Gaps 0; 



Qy          1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
              | ||||||:|||:||||| ||||||||:|||||: |||||||||||||||||||||||||
Db          1 MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy         61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
              |||||||||||||||| || ||||| ||||||||||||||||||||||||||||||||:|
Db         61 TWVGGGYINGTAEAVYGPGCGLAWAHAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFKQ 120

Qy        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
              ||||||||||||||||||||||||||||||||||||||||::||||:||||| |||||||
Db        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180

Qy        181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
              ||||||||||||||||:||||||||||||||| |||||||||||| |||||::| |||:|
Db        181 LYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240

Qy        241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
              ||:||||||||||||||||||||||||||||||||||||||||||:||| ||||||||||
Db        241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300

Qy        301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
              |||||| ||||| |||||||||||||||||||||||||||||||||||||||||||||||
Db        301 NQTAYGYPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy        361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
              ||||||||||||||||||||||||| |||||||||||||||||||||||||||||:||||
Db        361 RNIYQLSFRQNASDKEIVWVMRITVLVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420

Qy        421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
              ||||||:||||||||||||: ||||||||||||||||||||||||| | ||||||:||||
Db        421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFK 480

Qy        481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
              ||:||||| ||||:||||||||||||||||||||||||||||||||||||||:||||||:
Db        481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVRNENIKLN 540

Qy        541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
              ||| ||||||:||||||||||| |||||||||||||||||
Db        541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580





RESULT 7 

US-09-911-077A-24 

; Sequence 24, Application US/09911077A 

; Publication No. US20030114399A1

; GENERAL INFORMATION: 

; APPLICANT: BLAKELY, RANDY D. 

; APPLICANT: APPARSUNDARAM, SUBRAMANIAM 

; APPLICANT: FERGUSON, SHAWN 

; TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA 
; FILE REFERENCE: VBLT:008US 

; CURRENT APPLICATION NUMBER: US/09/911,077A
; CURRENT FILING DATE: 2001-07-23
; NUMBER OF SEQ ID NOS: 27

; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 24

; LENGTH: 580
; TYPE: PRT

; ORGANISM: Mus musculus
US-09-911-077A-24 

Query Match 94.0%; Score 2795; DB 10; Length 580; 

Best Local Similarity 92.6%; Pred. No. 9.8e-253; 

Matches 537; Conservative 23; Mismatches 20; Indels 0; Gaps 0; 

Qy          1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
              | ||||||:|||:||||| ||||||||:|||||: |||||||||||||||||||||||||
Db          1 MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy         61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
              |||||||||||||||| || ||||| ||||||||||||||||||||||||||||||||:|
Db         61 TWVGGGYINGTAEAVYGPGCGLAWAHAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFKQ 120

Qy        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
              ||||||||||||||||||||||||||||||||||||||||::||||:||||| |||||||
Db        121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180

Qy        181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
              ||||||||||||||||:||||||||||||||| |||||||||||| |||||::| |||:|
Db        181 LYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240

Qy        241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
              ||:||||||||||||||||||||||||||||||||||||||||||:||| ||||||||||
Db        241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300

Qy        301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
              |||||| ||||| |||||||||||||||||||||||||||||||||||||||||||||||
Db        301 NQTAYGYPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy        361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
              ||||||||||||||||||||||||| |||||||||||||||||||||||||||||:||||
Db        361 RNIYQLSFRQNASDKEIVWVMRITVLVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420

Qy        421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
              ||||||:||||||||||||: ||||||||||||||||||||||||| | ||||||:||||
Db        421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFK 480

Qy        481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
              ||:||||| ||||:||||||||||||||||||||||||||||||||||||||:||||||:
Db        481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVRNENIKLN 540

Qy        541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
              ||| ||||||:||||||||||| |||||||||||||||||
Db        541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580



RESULT 8 
US-10-241-784-2 

; Sequence 2, Application US/10241784 
; Publication No. US20040048261A1 
; GENERAL INFORMATION: 
; APPLICANT: Bayer Corporation 



; TITLE OF INVENTION: Invertebrate Choline Transporter Nucleic Acid, Polypeptides and Uses
; TITLE OF INVENTION: Thereof
; FILE REFERENCE: M07218
; CURRENT APPLICATION NUMBER: US/10/241,784
; CURRENT FILING DATE: 2002-09-11
; NUMBER OF SEQ ID NOS: 2
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 2
; LENGTH: 610
; TYPE: PRT
; ORGANISM: Drosophila melanogaster
US-10-241-784-2 

Query Match 50.7%; Score 1506.5; DB 12; Length 610; 

Best Local Similarity 56.8%; Pred. No. 5.3e-132; 

Matches 293; Conservative 84; Mismatches 120; Indels 19; Gaps 7; 

Qy 4 HVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWV 63

: : I : : : I : : I I I I I I : I I I I I MM: I I ::: I I I I I I I I I I I I I I I 
Db 3 NIAGWSIVLFYLLILWGIWAG-RKKQSGNDSE — EEVMLAGRSIGLFVGIFTMTATWV 59 

Qy 64 GGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQIYG 123 

11111111111:1 I M I I M I : I I I : I I : I I I I I M I M I I I M : I 
Db 60 GGGYINGTAEAI YT S — GLVWCQAP FGYALS LVFGGI FFANPMRKQGYI TMLDPLQDS FG 117 

Qy 124 KRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGGLYS 183

: | I I I I I I : I I I I I : I I I I I : I I I I I : I I I I I : I I I I : I : II I I I M M I 
Db 118 ERMGGLLFLPALCGEVFWAAGILAALGATLSVIIDMDHRTSVILSSCIAI FYTLFGGLYS 177 

Qy 184 VAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSWLDS 243

I I I I I I : I I I I I I : I I I :: I I I I : I : = : I : I I = : : : : I 

Db 178 VAYTDVIQLFCIFIGLWMCIPFAWSNEHVGSL SDLEVDWIGHVEPKKHWLYIDY 231 

Qy 244 FLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDWNQT 303

| | | : | | | | | | | | | | : : : : I I I I : : I I I M I I I I I : II I : I 

Db 232 GLLLVFGGI PWQVYFQR QNGRKGPAS AYVAAAGCI LMAI P PVLI GAI AKAT PWNET 287 

Qy 304 AYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNI 363 

I I I I : I I II I : I II I I I :: I I I I I I I I I I I I I I I I I I I : I I I : I I I I I I : 

Db 288 DYKGPYPLTVDETSMILPMVLQYLTPDFVSFFGLGAVSAAVMSSADSSVLSAASMFARNV 347 

Qy 364 YQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQLLC 423

| : M I I I I M I : II I I : : I I I I I I I : : I I I I : I I I I I : : : I I I I I 
Db 348 YKLIFRQKASEMEIIWVMRVAII VVGILATIMALTIPSIYGLWSMCSDLVYVILFPQLLM 407 

Qy 424 VL-FVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTL 482 

|:|| | | | I ::: I : I : I :: I I I I I I I I I I I I I I I I I : I : 

Db 408 WHFKKHCNT YGSLSAYI VALAI RLSGGEAI LGLAPLI KYPGY DEETKEQMFPFRTM 4 64 

Qy 483 AMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVV 518

M : I : I I : I : I : I I I I I II M II 
Db 465 AMLLSLVTLI SVSWWTKMMFESGKLPPSYDYFRCW 500 



RESULT 9 

US-09-911-077A-8 



; Sequence 8, Application US/09911077A 

; Publication No. US20030114399A1 

; GENERAL INFORMATION:

; APPLICANT: BLAKELY, RANDY D. 

; APPLICANT: APPARSUNDARAM, SUBRAMANIAM 

; APPLICANT: FERGUSON, SHAWN 

; TITLE OF INVENTION: HUMAN AND MOUSE CHOLINE TRANSPORTER cDNA 
; FILE REFERENCE: VBLT:008US 

; CURRENT APPLICATION NUMBER: US/09/911,077A

; CURRENT FILING DATE: 2001-07-23

; NUMBER OF SEQ ID NOS: 27

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 8

; LENGTH: 576
; TYPE: PRT 

; ORGANISM: Caenorhabditis elegans 
US-09-911-077A-8 

Query Match 48.9%; Score 1453; DB 10; Length 576; 

Best Local Similarity 50.5%; Pred. No. 5e-127; 

Matches 295; Conservative 95; Mismatches 150; Indels 44; Gaps 9;

Qy 7 GLIAIIVFYLLILLVGIWAAWRTKNSGSAEER S EAI I VGGRD I GL LVG G FTMT AT W 62 

I : : I I : I I : I I I : I I I I I : : I : I I : I : : : I I : M I I I I II I I i I 

Db 6 GIVAI VFFYVLI LWGIWAGRKS KS S KELES EAGAATEEVMLAGRNI GTLVGI FTMTATW 65 

Qy 63 VGGGY I NGT AEAVYVPGYGLAWAQAP I G Y S L S L I LGGL FFAKPMRS KG YVTMLD P FQQI Y 122 

I I I I I I I I I I I : I II II I : I I :: I I :: I I I I I I I I : I I : I I I I I I I I 

Db 66 VGGAYINGTAEALY — NGGLLGCQAPVGYAISLVMGGLLFAKKMREEGYITMLDPFQHKY 123 

Qy 123 GKRMGGLLFIPALMGEMFWAAAI FSALGATI SVI IDVDMHISVI I SALIATLYTLVGGLY 182 

: I : I I I : : : I II : I I || III I I I I I I : I I I : : I I : II : I I II II II I 
Db 124 GQRIGGLMYVPALLGETFWTAAILSALGATLSVILGIDMNASVTLSACIAVFYTFTGGYY 183 

Qy 183 S VAYT D WQL FC I FVGLW I S VP FAL S H P AVAD I G FT AVHAK YQK PWLGT VD S - S EVY S W L 241 

: I I I I I I I I II I I I I I I I : I I I : I II I I : I : I I : 

Db 184 AVAYT DWQ L FC I FVG LWVC VP AAMVH DGAK D I S RN A GDWIGEIGGFKETSLWI 237 

Qy 242 DS FLLLMLGGI PWQAYFQRVLS S S SATYAQVLS FLAAFGCLVMAI PAI LIGAI GASTDWN 301 

I I I I : I I I I II I II I I I I I : I I I I I I : I II :: II I I I I I I I : I I I 
Db 238 DCMLLLVFGGIPWQVYFQRVLSSKTAHGAQTLSFVAGVGCILMAIPPALIGAIARNTDWR 297 

Qy 302 QTAYGLPDPKTTEEA DMI LP I VLQYLCPVYI SFFGLGAVSAAVMS SADS S I LSA 355 

II : I I : : I :: I : I I I I I ::: I I I I I I I I I I I I I I I I I : I I I 

Db 298 MTDYSPWNNGTKVESIPPDKRNMWPLVFQYLTPRWVAFIGLGAVSAAVMSSADSSVLSA 357 

Qy 356 S SMFARNI YQL S FRQNAS DKEI VWVMRI TVFVFGAS ATAMALLTKTVYGLWYLS S DLVYI 415 

: I I I I I I : : I : I : I I : I I : : I I I I : I I I I I I : : : I I I I I I : I I I I : 
Db 358 ASMFAHNIWKLT I RPHAS EKEVI I VMRI AI I CVGIMAT IMALT I QS I YGLWYLCADLVYV 417 

Qy 416 VI FPQLLCVLFVKGTNT YGAVAGYVSGLFLRITGGEPYLYLQPLI FYPGYYPDDNGI YNQ 475 

: : I I I I I I I : : : : I I I I : : I I I I I I I : I I I I : I II I : I : I 
Db 418 ILFPQLLCWYMPRSNTYGSLAGYAVGLVLRLIGGEPLVSLPAFFHYPMY TDGV— Q 472 

Qy 476 KFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAW ARHSEENMDKTILV 532 

I I I : I I I : : I I : I : : I I : I I I I : I I II I I : I 

Db 473 YFPFRTTAMLSSMATIYIVSIQSEKLFKSGRLSPEWDVMGCWNIPIDHVPLPSDVSFAV 532 



Qy 533 KNENIKL DELALVKPRQSMTLSSTFTN 559 

: I : : I I I : I : I I : I 

Db 533 SSETLNMKAPNGTPAPVHPNQQPSDENTLLHPYSDQSYYSTNSN 576 



RESULT 10 
US-09-733-630-2 

Sequence 2, Application US/09733630 
Patent No. US20020034799A1 
GENERAL INFORMATION: 
APPLICANT: Donoho, Gregory 
APPLICANT: Scoville, John 
APPLICANT: Turner, C. Alexander Jr. 
APPLICANT: Freidrich, Glenn 
APPLICANT: Zambrowicz, Brian 
APPLICANT: Abuin, Alejandro 
APPLICANT: Sands, Arthur T. 

TITLE OF INVENTION: Novel Human Transporter Protein and
TITLE OF INVENTION: Polynucleotides Encoding the Same
FILE REFERENCE: LEX-0106-USA
CURRENT APPLICATION NUMBER: US/09/733,630
CURRENT FILING DATE: 2000-12-08 
PRIOR APPLICATION NUMBER: US 60/170,137 
PRIOR FILING DATE: 1999-12-10 
NUMBER OF SEQ ID NOS: 3 

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 2 
LENGTH: 675 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-733-630-2 

Query Match 10.5%; Score 311.5; DB 9; Length 675; 

Best Local Similarity 23.0%; Pred. No. 5.3e-20; 

Matches 152; Conservative 112; Mismatches 235; Indels 161; Gaps 28; 

Qy 2 AFHVEGL IAIIVFY-LLIL L VG I WAAW RT KN S G S AE E RS EAI I VG G RD I G L L VGG F 56 

II : I I | | : : I | I : | | | : | : : | | : : : I I : I 

Db 18 AFPQKGLEPGDIAVLVLYFLFVLAVGLWSTVKTK RDTVKGYFLAGGDMVWWPVGA 72 

Qy 57 TMTATWVGGGY I NGTAEAVYVPGYGLAWAQAP I G YS LS L I - LGGLF FAKPMRS 108 

: : I : I I I : I I I : I I I : I I II I I : 

Db 73 SLFASNVGSGHF IGLAGSSAATGISVSAYELNGLFSVLMLAWIFLPIYI 121 

Qy 109 KGYVTMLDPFQQI YGKRMGGLLFI PALMGEMFWAAAI FSAL GAT-ISVIIDVDM 161 

III: : : MM: II:: : : I I : : III : : I : 

Db 122 AGQVTTMPEYLR KRFGGI R- 1 P 1 1 LAVLYLFI YI FT KI S VDMYAGAI FI QQS LHLDL 177 

Qy 162 HISVIISALIATLYTLVGGLYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVH 221 

: : : : : I : I I : I I I : I I I I : I : : I : : I II I : 
Db 178 YIAI VGLLAITAVYTVAGGLAAVI YTDALQTLIMLI GALTLMGY- - S FAAVG — GMEGLK 233 

Qy 222 AKY QKPWLGTVDSSEVYS-WLDSFLLLML 249 

II III: : I I 

Db 234 EKYFLALASNRSENSSCGLPREDAFHIFRDPLTSDLPWPGVLFGMSIPSLWY 285 



Qy 250 GGIPW QAY FQRVL S S S SAT YAQVLS FLAAFGC LVMAI PAI L I GAI GAS T DWNQTAYG 306

I I M I : : : : : I : : : I I : : : : I : : I I 

Db 286 WCTDQVIVQRTLAAKNLSHAKGGALMAAYLKVLPLFIMVFPGMVSRILFPDQVA — 339

Qy 307 LPDPKTTEE ADMI L P I VLQ YLC P VYI S FFGLGAVSAAVMS SADS S I LSAS SM 358 

II::: : | : | : : | | : : : I I : I I I I III:: 

Db 340 C7U)PEICQKICSNPSGCSDIAYPKLVLELLPTGLRGLWQv^VMVAALMSSLTSIFNSASTI 399

Qy 359 FARN I YQL S FRQNAS DKE I VWVMRI T VFVFGAS AT AMALLT KT VYGLW 406 

I : : : I I I : I I : : I I : I II III 

Db 400 FTMDLWN-HLRPRASEKELMIVGRVFV LLLVLVSILWIPWQASQGGQL 447 

Qy 407 — YLSSDLVYI VIFPQLLCVLFVKGTNTYGAVAGYVSGLFLRITG-GEPYLYLQP 458 

I : I I : 1:1 : I I I I II I : I I I I : : : I : I I 

Db 448 FIYIQSISSYLQPPVAWF IMGCFWKRTNEKGAFWGLISGLLLGLVRLVLDFIYVQP 504

Qy 459 LIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDV 513 

II: : : : : I : I : I I : I : : : III : : 

Db 505 RC DQPDERPVLVKSIHYLYFSMILSTVTLITVSTVSWF TEPPSKEMVSHLT 555 

Qy 514 FDAWARHS EENMDKT I LVKNEN I KLD ELALVKPRQSMTLSSTFTNKEA 562 

III: I : : I : : :|: III I I : : 

Db 556 WFT RHD P WQKEQAP PAAP-L S LT L S QNGMP EAS S S S S VQ FEMVQENT S KT HS CDMT PKQ S 615



RESULT 11 

US-10-156-761-12818 

Sequence 12818, Application US/10156761
Publication No. US20030119018A1
GENERAL INFORMATION:
APPLICANT: OMURA, SATOSHI
APPLICANT: IKEDA, HARUO 
APPLICANT: ISHIKAWA, JUN 
APPLICANT: HORIKAWA, HIROSHI 
APPLICANT: SHIBA, TADAYOSHI 
APPLICANT: SAKAKI, YOSHIYUKI 
APPLICANT: HATTORI, MASAHIRA
TITLE OF INVENTION: NOVEL POLYNUCLEOTIDES
FILE REFERENCE: 249-262

CURRENT APPLICATION NUMBER: US/10/156,761
CURRENT FILING DATE: 2002-05-29
PRIOR APPLICATION NUMBER: JP 2001-204089
PRIOR FILING DATE: 2001-05-30
PRIOR APPLICATION NUMBER: JP 2001-272697
PRIOR FILING DATE: 2001-08-02
NUMBER OF SEQ ID NOS: 15109
SEQ ID NO 12818
LENGTH: 486
TYPE: PRT

ORGANISM: Streptomyces avermitilis
US-10-156-761-12818 

Query Match 10.3%; Score 306; DB 14; Length 486; 

Best Local Similarity 27.7%; Pred. No. 1.1e-19;

Matches 130; Conservative 84; Mismatches 194; Indels 62; Gaps 20; 



Qy 11 IIVFYLL-ILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYIN 69



:|| II : I : I I I h :l M : I : HI : ' ' 

Db 7 VIWYLAGMLAMGWWGMRRAKSKSD FLVAGRRLGPAMYSGTMAAIVLGGASTI 59 

Qy 70 GTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQIYGKRMGGL 129 

| || II I I I I I | | : | I I I : : I I I I 

Db 60 GGVGLGYKYGLSGAWMVFAIGLGL-LALSVFFSARIARLKVY-TVS EMLDLRYGGRAG-- 115 

Qy 130 LFI PALMGEMFWAAAI FSALGATI S VI I DVDMHI SVI I SALIATLYTLVGGLYS 183 

: | : || : |: :||: |: |:: :::|: I |: :||::| 

Db 116 VISGVVl^AYTLMI^VTSTIAYATIFDVLFDl^RTLAIILGGSIWAYSTLGGlWS 171

Qy 184 VAYT DWQL FC I FVG- LWI S VP FAL S H P AVAD I G FT AVHAK YQKPWLGTVDSSEVY 238 

: M : | | : | | : : | | : I I I : I : I I I I I I : : : 

Db 172 ITLTDMVQFWKTIGVLLLLLPIAI VKAGGFSAMKAKLPTEYFDP-LG-IGGETIF 225 

Qy 239 SWLDS FLLLMLGGI PWQAYFQRVLS S SSAT YAQVLS FLAAFGCLVMAI PAI LI GAI GAST 298 

: : : | : I : I : I I I : : I I I : : I III I = : M 

Db 226 TYV LIYTFGMLIGQDIWQRVFTARSDTTAKWGGTVAGTYCLVYALAGAVIG 276

Qy 299 DWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSM 358 

|| : M || ::: I I : II Mill:: ::::::: 

Db 277 T AAKVL Y P - T L P SAD S AFAT I VKDE L P VGVRGLVLAAALAAVMS T S S GAL I AC AT V 331 

Qy 359 FARNIYQ L S FRQNAS D KE I VWVMRI T VFVFGAS ATAMAL - LT KT VYGLW YL S S DLV 413

: | : : I I : I : I I : : I I : I : I II : I I 

Db 332 ANNDIWSRLRGVSSRK-GDDHDEVRGNRLFILVMGVAVICTAIALNDWEALTVAYNLLV 390 

Qy 414 YIVIFPQLLCVLFVKGTNTYGAVAGYVSG LFLRITGG EPYLY 455 

: : I I : I : : I I : I I : I : I : I I I I I I 

Db 391 GGLLVP I LGGLLWKRGT- VH GALAS VI VGGLAVI GLMAT FGI LANEPVYY 439 



RESULT 12 
US-10-119-988-12 

; Sequence 12, Application US/10119988 

; Publication No. US20030054453A1 

; GENERAL INFORMATION: 

; APPLICANT: Curtis, Rory A.J. 

; APPLICANT: Chen, Hong 

; APPLICANT: Millennium Pharmaceuticals Inc.
; TITLE OF INVENTION: 68723, Sodium/Glucose Cotransporter
; TITLE OF INVENTION: Family Members and Uses Therefor

; FILE REFERENCE: MPI01-103P1RNM
; CURRENT APPLICATION NUMBER: US/10/119,988
; CURRENT FILING DATE: 2002-04-10
; PRIOR APPLICATION NUMBER: 60/282,764
; PRIOR FILING DATE: 2001-04-10
; NUMBER OF SEQ ID NOS: 18

; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 12

; LENGTH: 664
; TYPE: PRT

; ORGANISM: Homo sapiens
US-10-119-988-12 



Query Match 10.3%; Score 306; DB 14; Length 664; 

Best Local Similarity 22.8%; Pred. No. 1.7e-19; 



Matches 148; Conservative 104; Mismatches 218; Indels 178; Gaps 



Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70 

| :::::::: | | : I I : I I I : : I I : | : : | : : I I : I 

Db 32 IVIYFVWMAVGLWAMFST-NRGTV GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 8 6 

Qy 71 TAEAVYVPGYGLAWAQAPIGYS LSLILGGLFFAKPMRSK-GYVTMLDPFQQI YGK 124 

| | | | II: I : : I I I I I : I I I I I : I 

Db 87 GTGAASGIAIGGFEWNALVLVWLGWLFV— PIYIKAGWTM PEYLRK 134

Qy 125 RMGG LLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SAL I A 172

Ml | | : | : : : I I I | :::::::::: : I

Db 135 RFGGQRIQVYLSLLSLLLYIFTKISADIFSGAIF INLALGLNLYLAI FLLLAIT 188



Qy 173 TLYTLVGGLYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQK PWL- 229

III: III s I Ml : I : I I I II I : I Ml I : 

Db 189 ALYTITGGLAAVIYTDTLQTVIMLVGSLILTGFAFHEVG GYDAFMEKYMKAI PT I V 244 

Qy 230 GTVDSSEVYS-WLDSFLLL MLGGIPW QAYFQRVLSS 264 

| : | : Ml: : I : I I I N I I : 

Db 245 SDGNTTFQEKCYTPRADSFHIFRDPLTGDLPWPGFIFGMSILTLWYWCTDQVIVQRCLSA 304 



Qy 265 SSATYAQ VLS FLAAFGCLVMAI PAI L IGAI GASTDWNQT 303

:| :| :: I : I 

Db 305 KNMSHVKGGCILCGYLKLMPMFIMVMPGMISRILYTEKIACWPSECEKYCGTKVGCTNI 364 



Qy 304 AYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNI 363

|| | :: | I : I : I : : I I I I |||::| :l 

Db 365 AY PT L WE LMPN GLRGLML S VMLAS LMS S LT S I FN S AS T L FTMD I 409

Qy 364 YQL S FRQNAS DKE I VWVMRI T VFV- FGAS AT AMALLT KTVYG LWYLS S DLVY I VI F 418 

| | : | I : I I : : I : : I I I : : : I 1=1 I : I 

Db 410 Y-AKVRKRASEKELMIAGRLFILVLIGISIAWVPIVQSAQSGQLFDYIQSITSYLGPPIA 468

Qy 419 PQLLCVLFVKGTNTYGAVAGYVSGLFLRI TG GEPYLY 455 

I : I I I I I I : I I : I I I \ \W 

Db 469 AVFLLAIFWKRVNEPGAFWGLILGLLIGISRMITEFAYGTGSCMEPSNCPTIICGVHYLY 528 

Qy 456 LQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFD 515 

: : | I I : I : I I I I : I : : : 

Db 529 FAIILF AISFITIWISLLTKPI PDVHLYR 558 

Qy 516 AV — VARHS EENMDKTI LVKNENI KLDELALVKPRQSMTLS STFTNKE 561 

: I I : I : : I I I : | : : : : : : I : 

Db 559 LCWSLRNSKEERID— LDAEEENIQ EGPKETIEIETQVPEKK 598 



RESULT 13 
US-09-928-530-2 

; Sequence 2, Application US/09928530 

; Patent No. US20020156002A1

; GENERAL INFORMATION: 

; APPLICANT: Curtis, Rory A. J. 

; APPLICANT: Silos-Santiago, Inmaculada 

; TITLE OF INVENTION: 32620, A NOVEL HUMAN SODIUM-SUGAR
; TITLE OF INVENTION: SYMPORTER FAMILY MEMBER AND USES THEREOF
; FILE REFERENCE: 10446-080001

; CURRENT APPLICATION NUMBER: US/09/928,530

; CURRENT FILING DATE: 2001-08-13

; PRIOR APPLICATION NUMBER: 60/227,068

; PRIOR FILING DATE: 2000-08-22

; NUMBER OF SEQ ID NOS: 7

; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 2
; LENGTH: 675
; TYPE: PRT

; ORGANISM: Homo sapiens
US-09-928-530-2 

Query Match 10.0%; Score 298.5; DB 9; Length 675; 

Best Local Similarity 22.7%; Pred. No. 8.7e-19; 

Matches 149; Conservative 115; Mismatches 238; Indels 155; Gaps 29; 

Qy 2 AFHVEGL IAIIVFY-LLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGF 56 

I I : i I | | : : | | | : | | | : I : : | | : : I : II 

Db 18 AFPQKGLEPGDIAVLVLYFLFVLAVGLWSTVKTKR DT VKG Y FLAE GNMVWWP VGA- 72 

Qy 57 TMTAT WVGGG Y I NGT AEAVYVP G YGLAWAQAP I G YS L S LILGGLFFAKPMRSKGY 111 

: : I : I I I : I I III : I I : I : I : I I : I 

Db 73 SLFASNVGSGHFIGLA -GSGAATGISVSAYELNGLFSVLMLAWIFL--PIYIAGQ 124 

Qy 112 VTMLDP FQQI YGKRMGGLLFI PALMGEMFWAAAI FS ALGAT I SVIID VDMHIS 164 

||::: I I I I : I I : : : : I I : : : : : I : I : : : : 

Db 125 VTTMPEYLR KRFGGI R-I PI I LAVLYLFI YI FTKI SVDMYAGAI FIQQSSHLDLYLA 180 

Qy 165 VI I SALIATLYTLVGGLYSVAYTDWQLFCI FVGLWI SVPFALSHPAVADI GFTAVHAKY 224 

: : I : II : I I I : I I I I : I : : I : : I II I : II 
Db 181 I VGLLAI TAVYTVAGGLAAVI YTDALQTLIMLIGALTLMGY — S FAAVG — GMEGLKEKY 236 

Qy 225 QKPWLGTVDSSEVYS-WLDSFLLLMLGGI 252 

III: : I I 

Db 237 FLALASNRSENSSCGLPREDAFHIFRDPLTSDLPWPGVLFGMSIPSLWY 285 

Qy 253 PW QAYFQRVLSS S SATYAQVLS FLAAFGCLVMAI PAI LI GAI GASTDWNQTAYGLPD 309 

I I I I 1 : : : : : I : : : I I : : : I : : I I I 

Db 286 -WCTDQVIVQRTLAAKNLSHAKGGALMAAYLKVLPLFIMVFPGMVSRILFPDQVA — CAD 342 

Qy 310 PKTTEE ADMI LPI VLQ YLCPVYI S FFGLGAVSAAVMS SADS S I LSAS SMFAR 361 

I : : : : I : I : : II : : : I I : II I I I I I : : I 

Db 343 PEICQKICSNPSGCSDIAYPKLVLELLPTGLRGLMMAVWAALMSSLTSIFNSASTIFTM 402 

Qy 362 NIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLW Y 407

:: : I I I : I I : : I I : I II III I 

Db 403 DLWN-HLRPRASEKELMIVGRVFV LLLVLVS I LWI P WQASQGGQLFI Y 450

Qy 408 LSSDLVYI VI FPQLLCVLFVKGTNTYGAVAGYVSGLFLRI TG-GEPYLYLQPLI F 461 

: I I : I : I : I I I I II I : I I I I : : : I : I I 

Db 451 IQSISSYLQPPVAWF IMGCFWKRTNEKGAFWGLISGLLLGLVRLVLDFIYVQPRC- 506 

Qy 462 YPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDV 513

II: : : : :|: I :| I :| :: : I II :: 

Db 507 DQPDERPVLVKSIHYLYFSMILSTVTLITVSTVSWF TEPPSKEMVSHLTWFT 558 

Qy 514 - FDAWARH S E ENMDKT I L VKN EN I KL D ELALVKPRQSMTLSSTFTNKEA 562 



Ill: | ::| : : :|: I I I I I:: 

Db 559 RHDPWQKEQAPPAAPLSLTLSQNGMPEASSSSSVQFEMVQENTSKTHSCDMTPKQS 615 



RESULT 14 
US-10-162-012-27 

; Sequence 27, Application US/10162012 

; Publication No. US20030051660A1 

; GENERAL INFORMATION: 

; APPLICANT: Curtis, Rory A.J. 

; APPLICANT: Silos-Santiago, Inmaculada 

; APPLICANT: Gu, Wei 

; TITLE OF INVENTION: NOVEL HUMAN ION CHANNEL AND TRANSPORTER FAMILY MEMBERS 

; FILE REFERENCE: 10448-190001 

; CURRENT APPLICATION NUMBER: US/10/162,012

; CURRENT FILING DATE: 2002-06-04 

; PRIOR APPLICATION NUMBER: US 60/209,845 

; PRIOR FILING DATE: 2000-06-06 

; PRIOR APPLICATION NUMBER: US 09/875,321 

; PRIOR FILING DATE: 2001-06-06 

; PRIOR APPLICATION NUMBER: PCT/US01/18340

; PRIOR FILING DATE: 2001-06-06 

; PRIOR APPLICATION NUMBER: US 60/209,257 

; PRIOR FILING DATE: 2000-06-05 

; PRIOR APPLICATION NUMBER: US 09/875,423 

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: PCT/US01/18398

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: US 60/209,238 

; PRIOR FILING DATE: 2000-06-05 

; PRIOR APPLICATION NUMBER: US 09/875,363 

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: PCT/US01/18247 

; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: US 60/227,068 

; PRIOR FILING DATE: 2000-08-22 

; PRIOR APPLICATION NUMBER: US 09/928,530 

; PRIOR FILING DATE: 2001-08-13

; PRIOR APPLICATION NUMBER: PCT/US01/25475

; PRIOR FILING DATE: 2001-08-15

; PRIOR APPLICATION NUMBER: US 60/226,770

; PRIOR FILING DATE: 2000-08-21

; PRIOR APPLICATION NUMBER: US 09/934,421

; PRIOR FILING DATE: 2001-08-21

; PRIOR APPLICATION NUMBER: PCT/US01/26096

; PRIOR FILING DATE: 2001-08-21 

; PRIOR APPLICATION NUMBER: US 60/279,281 

; PRIOR FILING DATE: 2001-03-28 

; PRIOR APPLICATION NUMBER: US 10/109,029 

; PRIOR FILING DATE: 2002-03-28 

; PRIOR APPLICATION NUMBER: PCT/US02/09728

; PRIOR FILING DATE: 2002-03-28 

; PRIOR APPLICATION NUMBER: US 60/290,288 

; PRIOR FILING DATE: 2001-05-11 

; PRIOR APPLICATION NUMBER: US (not assigned) 

; PRIOR FILING DATE: 2002-05-13 

; NUMBER OF SEQ ID NOS: 48

; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 27
; LENGTH: 675
; TYPE: PRT

; ORGANISM: Homo sapiens
US-10-162-012-27 

Query Match 10.0%; Score 298.5; DB 14; Length 675; 

Best Local Similarity 22.7%; Pred. No. 8.7e-19; 

Matches 149; Conservative 115; Mismatches 238; Indels 155; Gaps 29; 

Qy 2 AFHVEGL IAIIVFY-LLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGF 56 

II : M M::| I I :l ll:|s :ll : : I " 

Db 18 AFPQKGLEPGDIAVLVLYFLFVLAVGLWSTVKTKR DTVKGYFLAEGNMVWWPVGA- 72 

Qy 57 TMT AT WVG GG Y I N GTAEAVYVP G YGLAWAQAP I G Y S L S LILGGLFFAKPMRSKGY 111 

: : I : I I I : I I III : I ■ I : I : I : I I = I 

Db 73 SLFASNVGSGHFIGLA GSGAATGISVSAYELNGLFSVLMLAWIFL— PIYIAGQ 124 

Qy 112 VTMLDPFQQI YGKRMGGLLFI PALMGEMFWAAAI FSALGATI SVIID VDMHIS 164 

||::: I I I I : I I : : : : II— • : : I : I : : : : 

Db 125 VTTMPEYLR— KRFGGIR-IPIILAVLYLFIYIFTKISVDMYAGAIFIQQSSHLDLYLA 180 

Qy 165 VI I S AL I AT L YT LVGGL YS VAYT DWQL FC I FVGLW I S VP FAL S H PAVAD I GFTAVHAKY 224 

: : | : I I : I I I : I I I I : I : : I : : I I I I : M 

Db 181 IVGLLAITAVYTVAGGLAAVI YTDALQTLIMLIGALTLMGY — SFAAVG — GMEGLKEKY 236 

Qy 225 QKPWLGTVDS SEVYS-WLDS FLLLMLGGI 252

III: : I I 

Db 237 FLALASNRSENSSCGLPREDAFHIFRDPLTSDLPWPGVLFGMSIPSLWY 285

Qy 253 PW QAYFQRVLS S S SATYAQVLS FLAAFGCLVMAI PAI LI GAI GASTDWNQTAYGLPD 309

| | | | | : : : : : | : : : I I : : : : I : : I I I 

Db 286 -WCTDQVIVQRTLAAKNLSHAKGGALMAAYLKVLPLFIMVFPGMVSRILFPDQVA--CAD 342 

Qy 310 PKTTEE ADMI LP I VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSAS SMFAR 361 

| : : : : | : I : : I I : : : I I : I I I I I I I : : I 

Db 343 PEICQKICSNPSGCSDIAYPKLVLELLPTGLRGLMMAVMVAALMSSLTSIFNSASTIFTM 402 

Qy 362 NIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLW Y 407 

: : : I I I : I I : : I I : I I I I I I I 

Db 403 DLWN-HLRPRASEKELMIVGRVFV LLLVLVSILWIPWQASQGGQLFIY 450 

Qy 408 LSSDLVYI VI FPQLLCVLFVKGTNTYGAVAGYVSGLFLRITG-GEPYLYLQPLI F 461 

: | | : I : I : I I I I I I I : I I I I : : : I : I I 

Db 451 IQSISSYLQPPVAWF IMGCFWKRTNEKGAFWGLISGLLLGLVRLVLDFIYVQPRC- 506 

Qy 462 YPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDV 513

||::: : : I : I : I I : I : : : III:: 
Db 507 DQ P DERPVLVKS I H YLYFSMI LST VT L I TVSTVSWF TEPPSKEMVSHLTWFT 558 

Qy 514 - FDAWARH S EENMDKT I LVKN EN I KLD E LALVKP RQ SMT L S S T FTN KEA 562 

|||: | ::| : : :|: I I I I I" 

Db 559 RH D P WQKEQAP P AAP L S LT L S QNGMP EAS S S S S VQ FEMVQENT S KTH S C DMT P KQ S 615 






RESULT 15 



US-10-162-102-27 

; Sequence 27, Application US/10162102 

; Publication No. US20030232336A1 

; GENERAL INFORMATION: 

; APPLICANT: Curtis, Rory A.J. 

; APPLICANT: Silos-Santiago, Inmaculada
; APPLICANT: Gu, Wei 

; TITLE OF INVENTION: NOVEL HUMAN ION CHANNEL AND TRANSPORTER FAMILY MEMBERS 

; FILE REFERENCE: 10448-190001
; CURRENT APPLICATION NUMBER: US/10/162,102
; CURRENT FILING DATE: 2003-04-04 
; PRIOR APPLICATION NUMBER: US 60/209,845 
; PRIOR FILING DATE: 2000-06-06 
; PRIOR APPLICATION NUMBER: US 09/875,321 

; PRIOR FILING DATE: 2001-06-06
; PRIOR APPLICATION NUMBER: PCT/US01/18340 
; PRIOR FILING DATE: 2001-06-06 
; PRIOR APPLICATION NUMBER: US 60/209,257 
; PRIOR FILING DATE: 2000-06-05 
; PRIOR APPLICATION NUMBER: US 09/875,423 
; PRIOR FILING DATE: 2001-06-05 
; PRIOR APPLICATION NUMBER: PCT/US01/18398 
; PRIOR FILING DATE: 2001-06-05 

; PRIOR APPLICATION NUMBER: US 60/209,238
; PRIOR FILING DATE: 2000-06-05 
; PRIOR APPLICATION NUMBER: US 09/875,363 
; PRIOR FILING DATE: 2001-06-05 
; PRIOR APPLICATION NUMBER: PCT/US01/18247 
; PRIOR FILING DATE: 2001-06-05 
; PRIOR APPLICATION NUMBER: US 60/227,068 
; PRIOR FILING DATE: 2000-08-22 

; Remaining Prior Application data removed - See File Wrapper or PALM.
; NUMBER OF SEQ ID NOS: 48

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 27 
; LENGTH: 675 
; TYPE: PRT

; ORGANISM: Homo sapiens
US-10-162-102-27 

Query Match 10.0%; Score 298.5; DB 15; Length 675; 

Best Local Similarity 22.7%; Pred. No. 8.7e-19; 

Matches 149; Conservative 115; Mismatches 238; Indels 155; Gaps 29; 

Qy 2 AFHVEGL IAIIVFY-LLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGF 56 

li : I I I I : : I I I : I II : I : : I I : : I : I I 

Db 18 AFPQKGLEPGDI AVLVLYFLFVLAVGLWSTVKTKR DTVKGYFLAEGNMVWWPVGA- 72 

Qy 57 TMTATWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLS LILGGLFFAKPMRSKGY 111 

: : I : I I I : I I I I I : I I : I : I : I I : I 

Db 73 SLFASNVGSGHFIGLA GSGAATGISVSAYELNGLFSVLMLAWIFL — PIYIAGQ 124 

Qy 112 VTMLDPFQQIYGKRMGGLLFIPALMGEMFWAAAIFSALGATI SVIID VDMHIS 164 

||::: Nihil:::: I I : : : : : I : h : : : 

Db 125 VTTMPEYLR KRFGGI R- 1 PI I LAVLYLFI YI FTKI SVDMYAGAI FIQQSSHLDLYLA 180 



Qy 165 VIISALIATLYTLVGGLYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKY 224



: : I : I I : i I I : I I I I : I : : I : : I II I : II 

Db 181 I VGLLAI TAVYT VAG GLAAVT YT DALQT LIML I GALT LMG Y — S FAAVG — GMEG L KE K Y 236 

Qy 225 QKPWLGTVDSSEVYS-WLDSFLLLMLGGI 252 

Ml: : I I 

Db 237 FLALASNRSENSSCGLPREDAFHIFRDPLTSDLPWPGVLFGMSIPSLWY 285 

Qy 253 PW QAYFQRVLS S SSATYAQVLS FLAAFGCLVMAI PAI LI GAI GASTDWNQTAYGLPD 309

I I | | | : : : :: | : : : | | : :: : I : : | | | 

Db 286 -WCTDQVIVQRTLAAKNLSHAKGGALMAAYLKVLPLFIMVFPGMVSRILFPDQVA--CAD 342 

Qy 310 PKTTEE ADMI LP I VLQ YLC PVYI S FFGLGAVSAAVMS S ADS S I LSAS SMFAR 361 

I : : : : I : I : : II : : : I I : I I I I I I I : : I 

Db 343 PEICQKICSNPSGCSDIAYPKLVLELLPTGLRGLMMAVMVAALMSSLTSIFNSASTIFTM 402 

Qy 362 NIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLW Y 407 

I I I : I I : : I I : I II III I 

Db 403 DLWN-HLRPRASEKELMIVGRVFV LLLVLVSILWIPWQASQGGQLFIY 450 

Qy 408 LSSDLVYI VIFPQLLCVLFVKGTNTYGAVAGYVSGLFLRITG-GEPYLYLQPLIF 461

: I I : I : I : I I I I I I I : I I I I : : : I : I I 

Db 451 IQSISSYLQPPVAWF IMGCFWKRTNEKGAFWGLISGLLLGLVRLVLDFIYVQPRC- 506 

Qy 462 YPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDV 513 

II: : : : : I : I : I I : I : : : III : : 

Db 507 DQPDERPVLVKSIHYLYFSMILSTVTLITVSTVSWF TEPPSKEMVSHLTWFT 558 

Qy 514 - FDAWARHS EENMDKT I LVKNENI KLD ELALVKPRQSMTLSSTFTNKEA 562 

III: I : : I : : : I : III I I : : 

Db 559 RHDPWQKEQAPPAAPLSLTLSQNGMPEASSSSSVQFEMVQENTSKTHSCDMTPKQS 615 



Search completed: March 22, 2004, 15:45:44 
Job time : 536 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on: March 22, 2004, 15:17:05 ; Search time 105 Seconds 

(without alignments) 
1742.862 Million cell updates/sec 



Title: US-10-069-541-6

Perfect score: 2972

Sequence: 1 MAFHVEGLIAIIVFYLLILL EAFLDVDSSPEGSGTEDNLQ 580

Scoring table: BLOSUM62

Gapop 10.0 , Gapext 0.5

Searched: 1017041 seqs, 315518202 residues

Total number of hits satisfying chosen parameters: 1017041



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 



Database : SPTREMBL_25:*

1: sp_archea:*
2: sp_bacteria:*
3: sp_fungi:*
4: sp_human:*
5: sp_invertebrate:*
6: sp_mammal:*
7: sp_mhc:*
8: sp_organelle:*
9: sp_phage:*
10: sp_plant:*
11: sp_rodent:*
12: sp_virus:*
13: sp_vertebrate:*
14: sp_unclassified:*
15: sp_rvirus:*
16: sp_bacteriap:*
17: sp_archeap:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES

                    %
Result            Query
   No.    Score   Match  Length  DB  ID       Description

     1     2972   100.0     580   4  Q9GZV3   Q9gzv3 homo sapien
     2     2820    94.9     580  11  Q9JMD7
     3     2806    94.4     580  11  Q8BGY9
     4     2795    94.0     580  11  Q99PK3
     5     2791    93.9     580  11  Q9ESW5
     6     2253    75.8     584  13  Q8UWF0
     7
     8   1557.5    52.4     614   5  Q9VE46
     9   1530.5    51.5     579   5  Q9GPB1
    10     1453    48.9     576   5  O02228   O02228 caenorhabdi
    11    422.5    14.2     484  16  Q7UFM6
    12    405.5    13.6     462  16  Q8EXG7
    13    381.5    12.8     479  16  Q8Y273   Q8y273 ralstonia s
    14      344    11.6     492  17  Q9V2P3
    15      334    11.2     492  16  Q81Y52   Q81y52 bacillus an
    16      333    11.2     493  16  Q81AD3
    17      316    10.6     493  17  Q8U3M0   Q8u3m0 pyrococcus
    18    314.5    10.6     665  11  Q9QXI6   Q9qxi6 mus musculu
    19    314.5    10.6     665  11  Q8C3K6   Q8c3k6 mus musculu
    20    312.5    10.5     665  11  Q9QXX5   Q9qxx5 mus musculu
    21    311.5    10.5     480  16  Q8ERF0   Q8erf0 oceanobacil
    22      311    10.5     665  11  Q8CCA7   Q8cca7 mus musculu
    23    310.5    10.4     675   4  Q8WWX8   Q8wwx8 homo sapien
    24      309    10.4     670  11  Q923I7   Q923i7 mus musculu
    25    308.5    10.4     675   4  Q86Y55   Q86y55 homo sapien
    26    307.5    10.3     675   4  Q96PP5   Q96pp5 homo sapien
    27      306    10.3     486  16  Q82CQ9   Q82cq9 streptomyce
    28      304    10.2     567  16  Q8EQQ5   Q8eqq5 oceanobacil
    29      301    10.1     463  16  Q9I3S6   Q9i3s6 pseudomonas
    30      300    10.1     500  16  Q9CN55   Q9cn55 pasteurella
    31    299.5    10.1     507  16  Q9K9E2   Q9k9e2 bacillus ha
    32    299.5    10.1     546  16  Q8G6N6   Q8g6n6 bifidobacte
    33      299    10.1     662   6  Q9BDF6
    34      299    10.1     718  11  Q80WA5   Q80wa5 rattus norv
    35    297.5    10.0     673  11  Q8K0E3   Q8k0e3 mus musculu
    36      292     9.8     698   4  Q8WY15   Q8wy15 homo sapien
    37    291.5     9.8     535  16  Q8FQ71
    38      291     9.8     514  17  Q8TUR0   Q8tur0 methanosarc
    39    290.5     9.8     674   6  Q863B5   Q863b5 oryctolagus
    40      287     9.7     664                bos taurus
    41      287     9.7     678  11  Q8VDT1   Q8vdt1 mus musculu
    42    286.5     9.6     674   6  Q28728   Q28728 oryctolagus
    43      286     9.6     491  17  O58753   O58753 pyrococcus
    44      286     9.6     685  11  Q8BZW1   Q8bzw1 mus musculu
    45      286     9.6     685  11  Q8BGU9   Q8bgu9 mus musculu




ALIGNMENTS 



RESULT 1 
Q9GZV3 

ID Q9GZV3 PRELIMINARY; PRT; 580 AA.

AC Q9GZV3;

DT 01-MAR-2001 (TrEMBLrel. 16, Created)

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update)

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update)



DE High affinity choline transporter (High-affinity choline transporter 

DE CHT1).

GN CHT1 OR SLC5A7.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Hypothalamus ; 

RA Bruess M. ; 

RL Submitted (AUG-2000) to the EMBL/ GenBank/DDBJ databases. 

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Hypothalamus ; 

RA Wieland A., Bonisch H., Bruss M. ; 

RT "Molecular cloning of the human and murine high affinity choline 

RT transporters and characterization of the human gene-structure."; 

RL Submitted (AUG-2000) to the EMBL/ GenBank/DDBJ databases. 

RN [3] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20483599; PubMed=11027560 ; 

RA Apparsundaram S., Ferguson S.M., George A.L. Jr., Blakely R.D.; 

RT "Molecular cloning of a human, hemicholinium-3-sensitive choline 

RT transporter . " ; 

RL Biochem. Biophys. Res. Commun. 276:862-867(2000). 

RN [4] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Spinal cord; 

RX PubMed=11068039;

RA Okuda T., Haga T.;

RT "Functional characterization of the human high-affinity choline

RT transporter.";

RL FEBS Lett. 484:92-97(2000).

RN [5] 

RP SEQUENCE FROM N.A. 

RA Bruess M. ; 

RL Submitted (JAN-2001) to the EMBL/ GenBank/DDBJ databases. 

RN [6] 

RP SEQUENCE FROM N.A. 

RA Wieland A., Bonisch H., Bruss M. ; 

RT "Molecular cloning of the human and murine high affinity choline 

RT transporters and characterization of the human gene structure.";

RL Submitted (JAN-2002) to the EMBL/ GenBank/DDBJ databases. 

DR EMBL; AJ401466; CAC03717.1; -. 

DR EMBL; AF276871; AAG25940.1; -.

DR EMBL; AB043997; BAB18161.1; -. 

DR EMBL; AJ308378; CAC88115.1; -.

DR EMBL; AJ308379; CAC88115.1; JOINED. 

DR EMBL; AJ308380; CAC88115.1; JOINED. 

DR EMBL; AJ308381; CAC88115.1; JOINED. 

DR EMBL; AJ308382; CAC88115.1; JOINED. 

DR EMBL; AJ308383; CAC88115.1; JOINED. 

DR EMBL; AJ308384; CAC88115.1; JOINED. 

DR PIR; JC7502; JC7502. 

DR Genew; HGNC:14025; SLC5A7.

DR GO; GO:0005624; C:membrane fraction; NAS.

DR GO; GO:0015220; F:choline transporter activity; TAS.

DR GO; GO:0008292; P:acetylcholine biosynthesis; NAS.

DR InterPro; IPR001734; Na/solut_symport . 

DR Pfam; PF00474; SSF; 1. 

DR PROSITE; PS50283; NA_SOLUT_SYMP_3 ; 1. 

SQ SEQUENCE 580 AA; 63203 MW; 66CB35496CB6E2D6 CRC64; 

Query Match 100.0%; Score 2972; DB 4; Length 580; 

Best Local Similarity 100.0%; Pred. No. 5.3e-205; 

Matches 580; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
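
The two percentages in each result header appear to follow directly from the reported figures: Query Match is the score divided by the perfect score, and Best Local Similarity is the number of matches divided by the total number of alignment columns. The sketch below is an observation that agrees with the values printed in this listing, not documented GenCore behaviour; the function names are hypothetical.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Score as a percentage of the query's self-alignment (perfect) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches: int, conservative: int,
                              mismatches: int, indels: int) -> float:
    """Identical columns as a percentage of all alignment columns.

    Assumption: the column count is the sum of the four reported counts.
    """
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Figures copied from this listing (perfect score 2972).
print(query_match_pct(2820, 2972))                # RESULT 2 reports 94.9%
print(best_local_similarity_pct(540, 24, 16, 0))  # RESULT 2 reports 93.1%
print(best_local_similarity_pct(418, 83, 70, 6))  # RESULT 6 reports 72.4%
```

The same arithmetic gives 10.0% and 22.7% for the Score 298.5 hits with 149 matches in 657 columns.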

Qy   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

Qy 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
       ||||||||||||||||||||||||||||||||||||||||
Db 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580



RESULT 2 
Q9JMD7 

ID Q9JMD7 PRELIMINARY; PRT; 580 AA. 

AC Q9JMD7; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created) 



DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE High-affinity choline transporter CHT1. 

OS Rattus norvegicus (Rat) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Rattus. 

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Wistar; TISSUE=Spinal cord; 

RX MEDLINE=20116099; PubMed=10649566;

RA Okuda T., Haga T., Kanai Y., Endou H., Ishihara T., Katsura I.; 

RT "Identification and characterization of the high-affinity choline 

RT transporter . " ; 

RL Nat. Neurosci. 3:120-125(2000). 

DR EMBL; AB030947; BAA90484.1; -. 

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005215; F: transporter activity; IEA. 

DR GO; GO: 0006810; P: transport; IEA. 

DR InterPro; IPR001734; Na/solut_symport . 

DR Pfam; PF00474; SSF; 1. 

DR PROSITE; PS50283; NA_SOLUT_SYMP_3 ; 1. 

SQ SEQUENCE 580 AA; 63406 MW; B7CB73323DAD17A7 CRC64; 

Query Match 94.9%; Score 2820; DB 11; Length 580; 

Best Local Similarity 93.1%; Pred. No. 4.3e-194; 

Matches 540; Conservative 24; Mismatches 16; Indels 0; Gaps 0;

Qy   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
       | ||||||:|||:||||| ||||||||:|||||:||||||||||||||||||||||||||
Db   1 MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNAEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
       |||||||||||||||| || ||||||||||||||||||||||||||||||||||||||||
Db  61 TWVGGGYINGTAEAVYGPGCGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
       ||||||||||||||||||||||||||||||||||||||||::||||:||||| |||||||
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180

Qy 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
       ||||||||||||||||:||||||||||||||| |||||||||||| |||||::| |||:|
Db 181 LYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
       ||:||||||||||||||||||||||||||||||||||||||||||:||| ||||||||||
Db 241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
       |||||| ||||| |||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NQTAYGFPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||:||||
Db 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
       ||||||:||||||||||||: ||||||||||||||||||||||||||| ||||||:||||
Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYPDKNGIYNQRFPFK 480

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
       ||:||||| ||||:||||||||||||||||||:|||||:|||||||||||||:||||||:
Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDIFDAVVSRHSEENMDKTILVRNENIKLN 540

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
       ||| ||||||:||||||||||| |||||||||||||||||
Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580

RESULT 3 
Q8BGY9 

ID Q8BGY9 PRELIMINARY; PRT; 580 AA. 

AC Q8BGY9; 

DT 01-MAR-2003 (TrEMBLrel . 23, Created) 

DT 01-MAR-2003 (TrEMBLrel . 23, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel . 24, Last annotation update) 

DE Solute carrier family 5. 

GN SLC5A7 . 

OS Mus musculus (Mouse) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Diencephalon, and Head; 

RX MEDLINE=22354683; PubMed=12466851; 

RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation of 

RT 60,770 full-length cDNAs . " ; 

RL Nature 420:563-573(2002). 

DR EMBL; AK034415; BAC28702.1; -. 

DR EMBL; AK053063; BAC35253.1; 

DR MGD; MGI: 1927126; Slc5a7 . 

DR GO; GO: 0016020; C:membrane; IEA. 

DR GO; GO: 0005215; F: transporter activity; IEA. 

DR GO; GO: 0006810; P: transport; IEA. 

DR InterPro; IPR001734; Na/solut_symport . 

DR Pfam; PF00474; SSF; 1. 

DR PROSITE; PS50283; NA_SOLUT_SYMP_3 ; 1. 

SQ SEQUENCE 580 AA; 63364 MW; 6154CE6622772A41 CRC64; 



Query Match 94.4%; Score 2806; DB 11; Length 580; 

Best Local Similarity 92.9%; Pred. No. 4.3e-193; 

Matches 539; Conservative 23; Mismatches 18; Indels 0; Gaps 0; 



Qy   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
       |:||||||:|||:||||| ||||||||:|||||: |||||||||||||||||||||||||
Db   1 MSFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
       |||||||||||||||| || ||||||||||||||||||||||||||||||||||||||||
Db  61 TWVGGGYINGTAEAVYGPGCGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
       ||||||||||||||||||||||||||||||||||||||||::||||:||||| |||||||
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180

Qy 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
       ||||||||||||||||:||||||||||||||| |||||||||||| |||||::| |||:|
Db 181 LYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
       ||:||||||||||||||||||||||||||||||||||||||||||:||| ||||||||||
Db 241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDW 300

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
       |||||| ||||| |||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NQTAYGYPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
       ||||||||||||||||||||||||| |||||||||||||||||||||||||||||:||||
Db 361 RNIYQLSFRQNASDKEIVWVMRITVLVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
       ||||||:||||||||||||: ||||||||||||||||||||||||| | ||||||:||||
Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFK 480

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
       ||:||||| ||||:||||||||||||||||||||||||||||||||||||||:||||||:
Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVRNENIKLN 540

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
       ||| ||||||:||||||||||| |||||||||||||||||
Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580



RESULT 4 
Q99PK3 

ID Q99PK3 PRELIMINARY; PRT; 580 AA. 

AC Q99PK3; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created) 

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Sodium and chloride-dependent high-affinity choline transporter. 

GN SLC5A7 . 

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Apparsundaram S., Ferguson S.M., Blakely R.D.; 

RT "Molecular cloning and characterization of human and murine high- 

RT affinity choline transporters."; 

RL Submitted (FEB-2001) to the EMBL/ GenBank/DDBJ databases. 

DR EMBL; AF276872; AAG36945.2; -. 

DR MGD; MGI: 1927126; Slc5a7 . 

DR GO; GO: 0016020; C:membrane; IEA. 

DR GO; GO: 0005215; F : transporter activity; IEA. 

DR GO; GO: 0006810; P: transport; IEA. 



DR InterPro; IPR001734; Na/solut_symport . 

DR Pfam; PF00474; SSF; 1. 

DR PROSITE; PS50283; NA_SOLUT_SYMP_3 ; 1. 

SQ SEQUENCE 580 AA; 63383 MW; DDBF58ED428270AF CRC64; 

Query Match 94.0%; Score 2795; DB 11; Length 580; 

Best Local Similarity 92.6%; Pred. No. 2.7e-192; 

Matches 537; Conservative 23; Mismatches 20; Indels 0; Gaps 0; 

Qy   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
       | ||||||:|||:||||| ||||||||:|||||: |||||||||||||||||||||||||
Db   1 MPFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEERSEAIIVGGRDIGLLVGGFTMTA 60

Qy  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
       |||||||||||||||| || ||||| ||||||||||||||||||||||||||||||||:|
Db  61 TWVGGGYINGTAEAVYGPGCGLAWAHAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFKQ 120

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
       ||||||||||||||||||||||||||||||||||||||||::||||:||||| |||||||

Qy 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
       ||||||||||||||||:||||||||||||||| |||||||||||| |||||::| |||:|
Db 181 LYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
       ||:||||||||||||||||||||||||||||||||||||||||||:||| ||||||||||

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
       |||||| ||||| |||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NQTAYGYPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
       ||||||||||||||||||||||||| |||||||||||||||||||||||||||||:||||

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
       ||||||:||||||||||||: ||||||||||||||||||||||||| | ||||||:||||
Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFK 480

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
       ||:||||| ||||:||||||||||||||||||||||||||||||||||||||:||||||:
Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVRNENIKLN 540

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
       ||| ||||||:||||||||||| |||||||||||||||||
Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580



RESULT 5 
Q9ESW5 

ID Q9ESW5 PRELIMINARY; PRT; 580 AA.

AC Q9ESW5; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created) 

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 



DE High affinity choline transporter. 

GN SLC5A7 OR CHT1. 

OS Mus musculus (Mouse) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=BALB/cJ; TISSUE=Brain stem; 

RA Bruess M. ; 

RL Submitted (AUG-2000) to the EMBL/ GenBank/DDBJ databases. 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=BALB/cJ; TISSUE=Brain stem; 

RA Wieland A., Bonisch H., Bruss M. ; 

RT "Molecular cloning of the human and murine high affinity choline 

RT transporters and characterization of the human gene-structure."; 

RL Submitted (AUG-2000) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AJ401467; CAC03719.1; 

DR MGD; MGI: 1927126; Slc5a7 . 

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR001734; Na/solut_symport . 

DR Pfam; PF00474; SSF; 1. 

DR PROSITE; PS50283; NA_SOLUT_SYMP_3 ; 1. 

SQ SEQUENCE 580 AA; 63331 MW; A4F1387CAA9EAAFE CRC64 ; 

Query Match 93.9%; Score 2791; DB 11; Length 580; 

Best Local Similarity 92.4%; Pred. No. 5.1e-192; 

Matches 536; Conservative 24; Mismatches 20; Indels 0; Gaps 0; 
Qy   1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60
       |:||||||:|||:||||| ||||||||:|||||: || ||||||||||||||||||||||
Db   1 MSFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEEHSEAIIVGGRDIGLLVGGFTMTA 60

Qy  61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120
       |||||||||||| ||| || ||||||||||||||||||||||||||||||||||||||||
Db  61 TWVGGGYINGTAVAVYGPGCGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180
       ||||||||||||||||||||||||||||||||||||||||::||||:||||| |||||||
Db 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGG 180

Qy 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240
       ||||||||||||||||:||||||||||||||| |||||||||||| |||||::| |||:|
Db 181 LYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTW 240

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300
       ||:|||||||||||||||||||||||||||||||:||||||||||:||| ||||||||||
Db 241 LDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSYLAAFGCLVMALPAICIGAIGASTDW 300

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360
       |||||| ||||| |||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NQTAYGYPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420
       ||||||||||||||||||||||||| |||||||||||||||||||||||||||||:||||
Db 361 RNIYQLSFRQNASDKEIVWVMRITVLVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQ 420

Qy 421 LLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFK 480
       ||||||:||||||||||||: ||||||||||||||||||||||||| | ||||||:||||
Db 421 LLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFK 480

Qy 481 TLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIKLD 540
       ||:||||| ||||:||||||||||||||||||||||||||||||||||||||:||||||:
Db 481 TLSMVTSFFTNICVSYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVRNENIKLN 540

Qy 541 ELALVKPRQSMTLSSTFTNKEAFLDVDSSPEGSGTEDNLQ 580
       ||| ||||||:||||||||||| |||||||||||||||||
Db 541 ELAPVKPRQSLTLSSTFTNKEALLDVDSSPEGSGTEDNLQ 580



RESULT 6 
Q8UWF0 

ID Q8UWF0 PRELIMINARY; PRT; 584 AA. 

AC Q8UWF0; 

DT 01-MAR-2002 (TrEMBLrel. 20, Created) 

DT 01-MAR-2002 (TrEMBLrel. 20, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE High affinity choline transporter. 

GN CHT1. 

OS Torpedo marmorata (Marbled electric ray) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Chondrichthyes ; 

OC Elasmobranchii; Squalea; Hypnosqualea; Pristiorajea; Batoidea;

OC Torpediniformes; Torpedinoidei; Torpedinidae; Torpedo.

OX NCBI_TaxID=7788; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Electric lobe; 

RA Guermonprez L., O'Regan S., Meunier F.M., Morot-Gaudry-Talarmain Y. ; 

RT "Cyclosporin, FK506 and rapamycin inhibit neuronal choline uptake via 

RT calcineurin-dependent and independent mechanisms."; 

RL Submitted (NOV-2001) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AJ420808; CAD12727.1; 

DR GO; GO: 0016020; C:membrane; IEA. 

DR GO; GO: 0005215; F: transporter activity; IEA. 

DR GO; GO: 0006810; P: transport; IEA. 

DR InterPro; IPR001734; Na/solut_symport . 

DR Pfam; PF00474; SSF; 1. 

DR PROSITE; PS50283; NA_SOLUT_SYMP_3 ; 1. 

SQ SEQUENCE 584 AA; 63660 MW; 995F937B01195A3D CRC64; 

Query Match 75.8%; Score 2253; DB 13; Length 584; 

Best Local Similarity 72.4%; Pred. No. 2.1e-153; 

Matches 418; Conservative 83; Mismatches 70; Indels 6; Gaps 2; 

Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSG--SAEERSEAIIVGGRDIGLLVGGFTM 58

I | : : | : : I I : : I I I I I I I I : I I I I : : I I : I : I I I I I : : I I I I I I I I I I I I I I 

Db 1 MTVHIDGIVAIVLFYLLILFVGLWAAWKSKNTSMEGAMDRSEAIMIGGRDIGLLVGGFTM 60



Qy 59 TATWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPF 118

I M I II II I I I I I II I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I : I I I I I I I I I

Db 61 TATWVGGGYINGTAEAVYVPGYGLAWAQAPFGYALSLVIGGLFFAKPMRSRGYVTMLDPF 120



Qy 119 QQIYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLV 178

I I : I I I I I I I I I I I I I I : I I : I I : I 1. 1 I I I I I I : I I I : I : : : : : I I : : I I : I I I II I I

Db 121 QQMYGKRMGGLLFIPALLGEIFWSAAILSALGATLSVIVDININVSVWSAVIAVLYTLV 180



Qy 179 GGLYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVY 238

I I I I I I I I I I I I I I I I I I : I I I I I : I I I I : I I I I I II I I : I I : I : I : 
Db 181 GGLYSVAYTDVVQLFCIFLGLWISIPFALLNPAVTDIIVTANQEVYQEPWVGNIQSKDSL 240

Qy 239 SWLDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGAST 298

I : I : I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I II : : I I II : : I I I I I I II 
Db 241 IWIDNFLLLMLGGIPWQVYFQRVLSASSATYAQVLSFLAAFGCVLMAIPSVLIGAIGTST 300

Qy 299 DWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSM 358

I I I I I : I I I I I I II I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 301 DWNQTSYGLPGPIGKNETDMILPIVLQHLCPPYISFFGLGAVSAAVMSSADSSILSASSM 360

Qy 359 FARNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIF 418 

II I I I I I : I I I I I I I I I I I I I I I I : I : I I : I I : I I I I : : : I I I I I I I I I I I I : : I I 

Db 361 FARNIYHLAFRQEASDKEIVWVMRITIFLFGGAATSMALLAQSIYGLWYLSSDLVYVIIF 420

Qy 419 PQLLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPD----DNGIYN 474

III: I II I I I I I I I I :: I I I : I I I I : I I I I I I : : I I hill I I I : : I 

Db 421 PQLISVLFVKGTNTYGSIAGYIIGFLLRISGGEPYLHMQPFIYYPGCYLDHSFGDDPVYV 480

Qy 475 QKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKN 534

I : I I I I I : I I : III I : I I I I I I I II I I I I I I I : I I : : I I : I I I I I I : 
Db 481 QRFPFKTMAMLFSFLGNTGVSYLVKYLFVSGILPPKLDFLDSWSKHSKEIMDKTFLMNQ 540

Qy 535 ENIKLDELALVKPRQSMTLSSTFTNKEAFLDVDSSPE 571 

:|| I II II I ::|: MINI I:: :|l 
Db 541 DNITLSELVHVNPIHSASVSAALTNKEAFEDIEPNPE 577 



RESULT 7 
Q8AV27 

ID Q8AV27 PRELIMINARY; PRT; 377 AA. 

AC Q8AV27; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Putative high affinity choline transporter 1 (Fragment).

GN CHT1.

OS Gallus gallus (Chicken).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;

OC Gallus. 

OX NCBI_TaxID=9031; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Ciliary ganglion; 

RX MEDLINE=22308883; PubMed=12421710;

RA Mueller F., Rohrer H.;

RT "Molecular control of ciliary neuron development: BMPs and downstream 

RT transcriptional control in the parasympathetic lineage."; 

RL Development 129:5707-5717(2002). 

DR EMBL; AJ511267; CAD53475.1; -. 



DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

FT NON_TER 1 1

FT NON_TER 377 377

SQ SEQUENCE 377 AA; 41070 MW; 995293969378F8E7 CRC64; 

Query Match 56.5%; Score 1679; DB 13; Length 377; 

Best Local Similarity 85.4%; Pred. No. 1.9e-112; 

Matches 322; Conservative 28; Mismatches 27; Indels 0; Gaps 0; 

Qy 144 AIFSALGATISVIIDVDMHISVIISALIATLYTLVGGLYSVAYTDVVQLFCIFVGLWISV 203

I I I I I I I I I I I I I I ::::: I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I 
Db 1 AIFSALGATISVITDINVNLSVIISALIATLYTLVGGLYSVAYTDVVQLFCIFLGLWISV 60

Qy 204 PFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSWLDSFLLLMLGGIPWQAYFQRVLS 263

11111:111 I I I I I II I : I Mill: I : I : I I I : I I I I I I I I I I I I I I I I I I 

Db 61 PFALSNPAVTDIGFTAVHEVHQAPWLGTIGSLNIYTWLDNFLLLTFGGIPWQAYFQRVLS 120 

Qy 264 SSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDWNQTAYGLPDPKTTEEADMILPIV 323

I I I I I I I I I I I I I I I I I I : I I I I I I : I I I I I I I I I I I I I Ihlll : I I I I I I I I I 

Db 121 SSSATYAQVLSFLAAFGCIVMAIPAVLIGAIGASTAWNQTEYGVPDPIANKEADMILPIV 180

Qy 324 LQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVWVMRI 383

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I 
Db 181 LQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDREIVWVMRI 240

Qy 384 TVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQLLCVLFVKGTNTYGAVAGYVSGL 443 

I I I : I I I I I I I I I I I : I I I I I I I I I I I I I I : I I I I I I I I I I : I I I I I I I I : I I I : I I 
Db 241 TVFLFGASATAMALLASSVYGLWYLSSDLVYIIIFPQLLCVLFIKGTNTYGAIAGYLFGL 300

Qy 444 FLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFE 503 

I I I I I I I I I I I I I I I I : I I I I I I : I II I : I I I I I I I I : I I I III : I I I I I I I I 
Db 301 VLRITGGEPYLYLQPLIYYPGCYPDENNIYVQRFPFKTLAMLTSFFTNIIVSYLAKYLFG 360 

Qy 504 SGTLPPKLDVFDAVVAR 520

I I I II I I II I I I I I I 
Db 361 SGTLPPKLDFLDAVVAR 377



RESULT 8 
Q9VE46

ID Q9VE46 PRELIMINARY; PRT; 614 AA.

AC Q9VE46; Q961W3; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE CG7708 protein (GH02984p).

GN CG7708.

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;

OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;

OC Ephydroidea; Drosophilidae; Drosophila.

OX NCBI_TaxID=7227;



RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Berkeley; 

RX MEDLINE=20196006; PubMed=10731132;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,

RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,

RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,

RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,

RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,

RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,

RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,

RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,

RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,

RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,

RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,

RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,

RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,

RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,

RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,

RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,

RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,

RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,

RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,

RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,

RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,

RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,

RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,

RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,

RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,

RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,

RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,

RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,

RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,

RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,

RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,

RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,

RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,

RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,

RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,

RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;

RT "The genome sequence of Drosophila melanogaster.";

RL Science 287:2185-2195(2000). 

RN [2] 

RP SEQUENCE FROM N.A. 

RA Celniker S.E., Adams M.D., Kronmiller B., Wan K.H., Holt R.A.,

RA Evans C.A., Gocayne J.D., Amanatides P.G., Brandon R.C., Rogers Y.,

RA Banzon J., An H., Baldwin D., Banzon J., Beeson K.Y., Busam D.A.,

RA Carlson J.W., Center A., Champe M., Davenport L.B., Dietz S.M.,

RA Dodson K., Dorsett V., Doup L.E., Doyle C., Dresnek D., Farfan D.,

RA Ferriera S., Frise E., Galle R.F., Garg N.S., George R.A.,

RA Gonzalez M., Houck J., Hoskins R.A., Hostin D., Howland T.J.,

RA Ibegwam C., Jalali M., Kruse D., Li P., Mattei B., Moshrefi A.,

RA McIntosh T.C., Moy M., Murphy B., Nelson C., Nelson K.A., Nunoo J.,

RA Pacleb J., Paragas V., Park S., Patel S., Pfeiffer B.,

RA Phouanenavong S., Pittman G.S., Puri V., Richards S., Scheeler F.,

RA Stapleton M., Strong R., Svirskas R., Tector C., Tyler D.,

RA Williams S.M., Zaveri J.S., Smith H.O., Venter J.C., Rubin G.M.;

RT "Sequencing of Drosophila melanogaster genome.";

RL Submitted (MAR-2000) to the EMBL/GenBank/DDBJ databases.

RN [3] 

RP SEQUENCE FROM N.A. 

RA Misra S., Crosby M.A., Matthews B.B., Bayraktaroglu L., Campbell K.,

RA Hradecky P., Huang Y., Kaminker J.S., Prochnik S.E., Smith C.D.,

RA Tupy J.L., Bergman C., Berman B., Carlson J.W., Celniker S.E.,

RA Clamp M., Drysdale R., Emmert D., Frise E., de Grey A., Harris N.,

RA Kronmiller B., Marshall B., Millburn G., Richter J., Russo S.,

RA Searle S.M.J., Smith E., Shu S., Smutniak F., Whitfield E.,

RA Ashburner M., Gelbart W.M., Rubin G.M., Mungall C.J., Lewis S.E.;

RT "Annotation of Drosophila melanogaster genome.";

RL Submitted (MAR-2000) to the EMBL/GenBank/DDBJ databases.

RN [4] 

RP SEQUENCE FROM N.A. 

RA Adams M.D., Celniker S.E., Gibbs R.A., Rubin G.M., Venter C.J.;

RL Submitted (MAR-2000) to the EMBL/GenBank/DDBJ databases.

RN [5] 

RP SEQUENCE FROM N.A. 

RA FlyBase; 

RL Submitted (SEP-2002) to the EMBL/GenBank/DDBJ databases.

RN [6] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Berkeley; 

RA Stapleton M., Brokstein P., Hong L., Agbayani A., Carlson J.,

RA Champe M., Chavez C., Dorsett V., Farfan D., Frise E., George R.,

RA Gonzalez M., Guarin H., Li P., Liao G., Miranda A., Mungall C.J.,

RA Nunoo J., Pacleb J., Paragas V., Park S., Phouanenavong S., Wan K.,

RA Yu C., Lewis S.E., Rubin G.M., Celniker S.;

RL Submitted (JUL-2001) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AE003723; AAF55583.2; -. 

DR EMBL; AY047521; AAK77253.1; -. 

DR FlyBase; FBgn0038641; CG7708. 

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

SQ SEQUENCE 614 AA; 66893 MW; 71A77E1216360042 CRC64;

Query Match 52.4%; Score 1557.5; DB 5; Length 614; 

Best Local Similarity 58.5%; Pred. No. 1.8e-103; 

Matches 302; Conservative 84; Mismatches 115; Indels 15; Gaps 6; 

Qy 4 HVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWV 63

: : I : : : I : : I I I I I I : I II I I Mil: I I ::: I I I I I I I I I I I I I I I 

Db 3 NIAGVVSIVLFYLLILVVGIWAG-RKKQSGNDSE--EEVMLAGRSIGLFVGIFTMTATWV 59

Qy 64 GGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQIYG 123

|||||llllll:| I I I I I I I I : I I I : I I : I I I I I I s I I : I I I I I I : ' 

Db 60 GGGYINGTAEAIYTS--GLVWCQAPFGYALSLVFGGIFFANPMRKQGYITMLDPLQDSFG 117



Qy 124 KRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGGLYS 183

:|||||||:||| Ihllll I : II I I I : I I I I I : I llhl: M III Mill

Db 118 ERMGGLLFLPALCGEVFWAAGILAALGATLSVIIDMDHRTSVILSSCIAIFYTLFGGLYS 177



Qy 184 VAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSWLDS 243

I I II M : M I M I : I I I : : M I I : I : : : I : I I : : : : : I 

Db 178 VAYTDVIQLFCIFIGLWMCIPFAWSNEHVGSL------SDLEVDWIGHVEPKKHWLYIDY 231

Qy 244 FLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDWNQT 303

I II : I I I I I I I I I I I I I I : I I I : I I : : I I I I : : I I I I : I I I I I : I I I : I 
Db 232 GLLLVFGGIPWQVYFQRVLSSKTAGRAQLLSYVAAAGCILMAIPPVLIGAIAKATPWNET 291 

Qy 304 AYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNI 363 

I 111:1 I I II : I I I I I I : : I I I I I I I I I I I I I I I I I I I : I II : I I I I I I : 

Db 292 DYKGPYPLTVDETSMILPMVLQYLTPDFVSFFGLGAVSAAVMSSADSSVLSAASMFARNV 351 

Qy 364 YQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQLLC 423 

I : I I I I I I : I I : I I I I :: I I I I I I I : : I I I I : I I I I I : : : I I I I I 
Db 352 YKLIFRQKASEMEIIWVMRVAIIWGII^TIMALTIPSIYGLWSMCSDLVYVILFPQLI^ 411 

Qy 424 VL-FVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTL 482

I : I I II I I : :: I : I : I : : I I I I I I I I I I I I I I I I I : I : 

Db 412 WHFKKHCNTYGSLSAYIVALAIRLSGGEAILGLAPLIKYPGY DEETKEQMFPFRTM 468 

Qy 483 AMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVV 518

I I : I : I I : I : I : I I I I I I I II I I 
Db 469 AMLLSLVTLISVSWWTKMMFESGKLPPSYDYFRCW 504 



RESULT 9 
Q9GPB1 

ID Q9GPB1 PRELIMINARY; PRT; 579 AA. 

AC Q9GPB1; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created)

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update)

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update)

DE Choline cotransporter.

OS Limulus polyphemus (Atlantic horseshoe crab).

OC Eukaryota; Metazoa; Arthropoda; Chelicerata; Merostomata; Xiphosura; 

OC Limulidae; Limulus. 

OX NCBI_TaxID=6850; 

RN [1] 

RP SEQUENCE FROM N.A.

RX MEDLINE=21261948; PubMed=11368908;

RA Wang Y., Cao Z., Newkirk R.F., Ivy M.T., Townsel J.G.;

RT "Molecular cloning of a cDNA for a putative choline co-transporter

RT from Limulus CNS."; 

RL Gene 268:123-131(2001). 

DR EMBL; AY011119; AAG41055.1; -. 

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

SQ SEQUENCE 579 AA; 62937 MW; FE7F29D4FAF47F04 CRC64; 

Query Match 51.5%; Score 1530.5; DB 5; Length 579; 

Best Local Similarity 52.0%; Pred. No. 1.5e-101; 

Matches 305; Conservative 115; Mismatches 134; Indels 33; Gaps 11; 



Qy 1 MAFHVEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60 

I I : : I : : : I : I I : : I I : I I I I I : I I : I : : I I : : I I : I I I I I I I I I 

Db 1 MAVNILGVVSIGIFYVIILIVGIWAS-RKKKTSSGQSETEEIMLAGRNIGFLVGVLTMTA 59

Qy 61 TWVGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQ 120

I I I I I I I I I I I I I I : I II I III 11:111 : I I : III II I : I I I I I I I I I : 
Db 60 TWVGGGYINGTAEAMY--NNGLVWCQAPFGYALSLFIGGIVFAKKMRSQGYVTMLDPLQE 117

Qy 121 IYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGG 180

: I : II I I I I : I I I II : I I : I I I : I I I I I I I I I = = : I : I : I : II M I I 
Db 118 NFGSKMGGLLFLPALCGEIFWSAAILAALGATISVITELESSTSIIVSSSIAVFYTFFGG 177

Qy 181 LYSVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSW 240

I | I I I I I I : I I I I I I I I I : : I I : I I I I : : I I : I ' : 

Db 178 FYSVAYTDVIQLFCIFFGLWLCIPFSFSHEAVGSLS SIDFLGSVKLSDAGIN 229

Qy 241 LDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDW 300

: I : | | | : I I I II I I I I I I I I : : : I I I I I :: I I I I : I I I I I I I I I I I : I I 
Db 230 VDIWLLLIFGGIPWQVYFQRVLSAKNVSNAQVLSYVAAVGCVVMAIPAILIGVIAKATAW 289

Qy 301 NQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFA 360 

| : | | | : | | : : : I I : I I II I : I I I I I I I I I I I I I I I : I I I I I I I I I : I : 
Db 290 NETALGM--PLTPNDTSLVLPLVLHYLTPTAVSFFGLGAVSAAVMSSSDSSILSASSLFS 347

Qy 361 RNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQ 420

M : I : I I I I II : : I : I I I : I I : : I I I I I I I I I : I I I I I : I : : : I I I 

Db 348 RNVYKLI FRQKAS EREWWVI RI S I LWGI LATAMALTVKS VYGLW YLS S DL I YVI LFPQ 407 

Qy 421 LLCVLFVKG-TNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPF 479 

||||: : | M I I : : : I = I II I I I I 1-1 = 1 I I : : : : I Mill 
Db 408 LLCVVHLKKYCNTYGSLSAYIVGFLLRALGGESILGLEPVIHYP-FFSETSG---QRFPF 463

Qy 480 KTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAVVARHSEENMDKTILVKNENIK- 538

: M M : I M : II : I : : I I II I I M I I : : I I : : I : : : 

Db 464 RTLSMLASLITLLAISGITKWIFEMNHLPAKLDIFRCVT — NIQEN IIKIQKLQG 516 

Qy 539 LDEL- -ALVKPRQSMTLS ST FTNKEAFLDVD S S PEGS GT EDN 578 

||: : : : : : : : II II : I : : I 
Db 517 GAMPVLDS I KKEI YQKDMNNS FNTWNSGNAELLTDSTYSGKI KKNN 563 



RESULT 10 
O02228

ID O02228 PRELIMINARY; PRT; 576 AA.

AC O02228; Q9NL58;

DT 01-JUL-1997 (TrEMBLrel. 04, Created) 

DT 01-OCT-2001 (TrEMBLrel. 18, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE C48D1.3 protein (High-affinity choline transporter CHO-1).

GN C48D1.3 OR CHO-1.

OS Caenorhabditis elegans.

OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea; 

OC Rhabditidae; Peloderinae; Caenorhabditis. 

OX NCBI_TaxID=6239; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Burton J.;

RL Submitted (OCT-1996) to the EMBL/GenBank/DDBJ databases.

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=99069613; PubMed=9851916;

RA none; 

RT "Genome sequence of the nematode C.elegans: A platform for 

RT investigating biology."; 

RL Science 282:2012-2018(1998). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=N2;

RX MEDLINE=20116099; PubMed=10649566;

RA Okuda T., Haga T., Kanai Y., Endou H., Ishihara T., Katsura I.;

RT "Identification and characterization of the high-affinity choline 

RT transporter."; 

RL Nat. Neurosci. 3:120-125(2000). 

DR EMBL; Z81049; CAB02847.2; -. 

DR EMBL; AB030946; BAA90483.1; -. 

DR PIR; T20037; T20037. 

DR WormPep; C48D1.3; CE27109. 

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

SQ SEQUENCE 576 AA; 62427 MW; FAB09778358288D9 CRC64; 

Query Match 48.9%; Score 1453; DB 5; Length 576; 

Best Local Similarity 50.5%; Pred. No. 5.4e-96; 

Matches 295; Conservative 95; Mismatches 150; Indels 44; Gaps 9; 

Qy 7 GLIAIIVFYLLILLVGIWAAWRTKNSGSAEER----SEAIIVGGRDIGLLVGGFTMTATW 62

|::||: I I : M I : I I I I I ::|:| I : I ::: I I : I I I I I I I I I I I I 

Db 6 GIVAIVFFYVLILVVGIWAGRKSKSSKELESEAGAATEEVMLAGRNIGTLVGIFTMTATW 65

Qy 63 VGGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQIY 122 

11111111111:1 II I I I : I I :: I I :: I I I I M I I : I I : I I I I I I I I 

Db 66 VGGAYINGTAEALY--NGGLLGCQAPVGYAISLVMGGLLFAKKMREEGYITMLDPFQHKY 123

Qy 123 GKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGGLY 182

' | : | : | | I : : : I I I : I I I I I I I I I I I I I : I I I : : I I : I I : I I I I I I I I I 
Db 124 GQRIGGLMYVPALLGETFWTAAILSALGATLSVILGIDMNASVTLSACIAVFYTFTGGYY 183

Qy 183 SVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDS-SEVYSWL 241

: | | | I I I I I I I I I I I I I I : I I I : I II I I • I : I I : 

Db 184 AVAYTDWQLFCI FVGLWVCVPAAMVHDGAKDI S RNA GDWIGEIGGFKETSLWI 237 

Qy 242 DSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDWN 301

I I I I : I I I I I I I I I II I I I : I I I I I I : I I I : : I I : I I I 

Db 238 DCMLLLVFGGIPWQVYFQRVLSSKTAHGAQTLSFVAGVGCILMAIPPALIGAIARNTDWR 297 

Qy 302 QTAYGLPDPKTTEEA------DMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSA 355

| | : | I : : | :: I : I I I I I ::: I I I I I I I I I I I I I I I I I : I I I 

Db 298 MTDYSPWNNGTKVESIPPDKRNMWPLVFQYLTPRWVAFIGLGAVSAAVMSSADSSVLSA 357 



Qy 356 SSMFARNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYI 415

•Mil I I • • I • I • I I • I I • • I I 1 I • I II 111 • • • II M II •

Db 358 ASMFAHNIWKLTIRPHASEKEVIIVMRIAIICVGIMATIMALTIQSIYGLWYLCADLVW 417

Qy 416 VIFPQLLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQ 475

: : I I I I I I I : : : : I I I I : : I I I I I I I : I I I I : I III : I : I 
Db 418 ILFPQLLCWYMPRSNTYGSLAGYAVGLVLRLIGGEPLVSLPAFFHYPMY TDGV — Q 472 

Qy 476 KFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAW ARHS EENMDKT I LV 532 

I I I : I I I : : I I : I : : I I : I I I I : I I II I I : I 

Db 473 Y FP FRT T AML S SMAT IYIVSIQSEKLFKS GRL S P EWD VMGC WN I P I DH VP LPS D VS FAV 532 

Qy 533 KNENIKL DELALVKPRQSMTLSSTFTN 559 

: I : : I I I : I : I I : I 

Db 533 SSETLNMKAPNGTPAPVHPNQQPSDENTLLHPYSDQSYYSTNSN 576 



RESULT 11 
Q7UFM6 



ID Q7UFM6 PRELIMINARY; PRT; 484 AA. 

AC Q7UFM6; 

DT 01-OCT-2003 (TrEMBLrel. 25, Created)

DT 01-OCT-2003 (TrEMBLrel. 25, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)

DE High affinity choline transporter.

GN CHT1 OR RB8472.

OS Rhodopirellula baltica.

OC Bacteria; Planctomycetes; Planctomycetacia; Planctomycetales;

OC Planctomycetaceae; Pirellula. 

OX NCBI_TaxID=117; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=1; 

RX MEDLINE=22735913; PubMed=12835416; 

RA Gloeckner F.O., Kube M., Bauer M., Teeling H., Lombardot T.,

RA Ludwig W., Gade D., Beck A., Borzym K., Heitmann K., Rabus R.,

RA Schlesner H., Amann R., Reinhardt R.;

RT "Complete genome sequence of the marine planctomycete Pirellula sp.

RT strain 1.";

RL Proc. Natl. Acad. Sci. U.S.A. 100:8298-8303(2003). 

DR EMBL; BX294147; CAD78656.1; -. 

KW Complete proteome. 

SQ SEQUENCE 484 AA; 52674 MW; 79AB0135F18FEBB2 CRC64; 



Query Match 14.2%; Score 422.5; DB 16; Length 484; 

Best Local Similarity 27.6%; Pred. No. 3.9e-22; 

Matches 141; Conservative 95; Mismatches 220; Indels 55; Gaps 14; 

Qy 7 GLIAIIVFYLLI-LLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGG 65 

I I I I : : I I I : : : I : I I I I : : : I I I : I : : I I I I 

Db 2 GLIAAVLAYLLLTIAI GLLAARRVGN AQ D FMVAGR S L P L YMN FACVFATW FG - 53 

Qy 66 GYINGTAEAVY VPGYGL-AWAQAPI GYSLSLI LGGLFFAKPMRS KGYVTMLDP FQ 119 

III I I I I I I : I : I : I I I I I : : I : I : : 

Db 54 AETVLSVSATFAGQGLRAIPGDPFGFSICLVLVALFFARAFYRMDLLTIGDFYR 107 

Qy 120 QIYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVI IDVDMHISVIISALIAT 173 

: II : : I : : II II : I I I III: : : : : : I I 



Db 108 KRYGRSIEVLTSWISASYLGWAAAQLTALGLVISVLGKGIGYETLTINNGIVIGFTIVA 167 

Qy 174 LYTLVGGLYSVAYTDVVQLFCIFVGLW-ISVPFALSHPAVADIGFTAVHAKYQKPWLGTV 232

I I : : I I : : I I I I I : : I I I : I I : I I I : I : : : I :. : : 
Db 168 FYTVMGGMWSVALTDMIQTFVIIIGLLWSVYMAKAAGGVSWIESARESGRLQVFPDWG 227 

Qy 233 DSSEVYSWLDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAA-FGCLVMAIPA-IL 290

I : : : : I I I I I I I I I I I I : I : I I I : : I I 

Db 228 QSGQWWIYIGGFLTAALGSIPQQDVFQRVTSAKDERTAMTGTLLGGMFYCMFAFVPMFIA 287 

Qy 291 IGAIGASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADS 350 

I : II : I II: I : : I I I : : I : : I : I 

Db 288 YAAWI DPDHLQQF NSDDLREVQRTLPHAVIQSTPFWVQTVFLGALVSAILSTASG 343 

Qy 351 SILSASSMFARNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLT-KTVYGLWYLS 409

: : I : I I : I : : I I : I I : : : I I : : I I I I I I : I : I : : 

Db 344 TLLAPSSLIVENVIR-PFRSDLDDKNMLRWLRIVLLMFGALALHQALTSNNTMYEMIQQA 402 



Qy 410 SDLVYIVIFPQLLCVLFVKGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDD 469 

: : I I : I I I I : I I I I : : : I : I I 

Db 403 YSVPLVGALVPLAVGLYWKRATTRGAMASIVSGVATWLA FEYMLPEFLIPS 453 



Qy 470 NGIYNQKFPFKTLAMVTSFLTNICISYLAKY 500 

: : : III : : I I I : 
Db 454 QLMGLAAS FLAMVWS LLDKF 474 



RESULT 12 
Q8EXG7 

ID Q8EXG7 PRELIMINARY; PRT; 462 AA.

AC Q8EXG7; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Probable sodium:solute symporter.

GN LB245.

OS Leptospira interrogans.

OC Bacteria; Spirochaetes; Spirochaetales; Leptospiraceae; Leptospira.

OX NCBI_TaxID=173; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=56601 / Serogroup Icterohaemorrhagiae / Serovar lai; 

RA Ren S.;

RL Submitted (MAR-2002) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AE011612; AAN51804.1; -. 

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Complete proteome. 

SQ SEQUENCE 462 AA; 50487 MW; C9B0104065514C68 CRC64; 



Query Match 13.6%; Score 405.5; DB 16; Length 462; 

Best Local Similarity 27.5%; Pred. No. 6.2e-21; 

Matches 134; Conservative 103; Mismatches 204; Indels 47; Gaps 16; 



Qy 8 LIAI-IVFYLL-ILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGG 65

: : I I : : 1 M : 1 : 1 1 : : 1 : : : 1 1 : ... 1 : : 1 1 1 1

Db 1 MLAISVIFYLFTTILIGAVASRFVSD-------SKDYVLAGRRLPLFLASSALFATWFGS 53

Qy 66 GYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQIYGKR 125

: 1 1 : : 1 1 : 1 1 : 1 1 1 1 1 1 1 1 : 1 : : 1 1 : : : 1 : 1

Db 54 ETLLG-ASSRFVEDGILGVIEDPFGAALCLFLVGLFFARPLYRMNILTFGDFYKNRFGRR 112

Qy 126 MGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYTLVGGLY 182

: : ||: 1 1 1 1 1 1 1 1 : 1 : : : 1 1 : : 1 1 : 1 1 : =

Db 113 AEILSSVFMIPSYFG WIAAQFVALGIIFHSLADIPVSTGIIAGAGWLIYTVTGGMW 169

Qy 183 SVAYTDVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVY 238

::: II :| 1 -II 1 : II 1 1 : 1 II : ::: ::

Db 170 AISLTDFLQTVLIVLGLSYLV-WDLSSKAG GIEKILAS-TKPGFFRFFPEMNAKSIF 224

Qy 239 SWLDSFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMA-IPAILIGAIGAS 297

::::::: 1 1 1 1 1 1 1 1 1 :: 1 1 1 1 1 : 1 1 : 1 : 1 II : 1

Db 225 AYIAAWMTIGLGSIPQQDIFQRVMASKSEKVAVYSSLLGSFFYLSVAFLP--LIAVLCAR 282

Qy 298 TDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASS 357

: : I 1 : 1 1 1 1 1 : : : | | : : | | 1 1 : 1 : 1 1 : : : 1

Db 283 KIYPEIA--------KEDAQMILPKTVLTHTGLFTQILFFGALLSAVMSTASGAILASAS 334

Qy 358 MFARNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYGLWYLSSDLVYIVI 417

: | : : | : : | : : : : : | : : | : : I I : I I : | | : :

Db 335 VLGENVIRPFFKK-TSERTLLRLFRLSVIAITLVSLSMANTKSNIYELVSQASALSLVSL 393

Qy 418 468

1 I : 1 1 1 : : 1 1 : : 1 III II : I I : :

Db 394 FIPLVAGLFRKNSTSTGAIFSMIVGFCTWFLCNILSLEIPASIPGLISSWIALYLGDWME 453

Qy 469 DNGIYNQK 476

1 1 1 1

Db 454 HRG-YIQK 460





RESULT 13 
Q8Y273 

ID Q8Y273 PRELIMINARY; PRT; 479 AA. 

AC Q8Y273; 

DT 01-MAR-2002 (TrEMBLrel. 20, Created) 

DT 01-MAR-2002 (TrEMBLrel. 20, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Probable sodium/solute symporter transmembrane protein.

GN RSC0463 OR RS04434.

OS Ralstonia solanacearum (Pseudomonas solanacearum).

OC Bacteria; Proteobacteria; Betaproteobacteria; Burkholderiales;

OC Burkholderiaceae; Ralstonia. 

OX NCBI_TaxID=305; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=GMI1000; 

RX MEDLINE=21681879; PubMed=11823852;

RA Salanoubat M., Genin S., Artiguenave F., Gouzy J., Mangenot S.,

RA Arlat M., Billault A., Brottier P., Camus J.C., Cattolico L.,

RA Chandler M., Choisne N., Claudel-Renard C., Cunnac S., Demange N.,

RA Gaspin C., Lavie M., Moisan A., Robert C., Saurin W., Schiex T.,

RA Siguier P., Thebault P., Whalen M., Wincker P., Levy M.,

RA Weissenbach J., Boucher C.A.;

RT "Genome sequence of the plant pathogen Ralstonia solanacearum.";

RL Nature 415:497-502(2002). 

DR EMBL; AL646059; CAD13991.1; -. 

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Complete proteome.

SQ SEQUENCE 479 AA; 52091 MW; 560962E411DBC9B8 CRC64;

Query Match 12.8%; Score 381.5; DB 16; Length 479; 

Best Local Similarity 28.2%; Pred. No. 3.4e-19; 

Matches 128; Conservative 85; Mismatches 188; Indels 53; Gaps 16; 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70 

: I I :::::: I : I I I I : I : III: I I : I I I I : I 

Db 6 VIVYWVT SVGI GLWAALRVRNTAD FAVAGRGL P F YWT ATVFATWFGS ETVLG 58 

Qy 71 TAEAVYVPGYGLAWAQA--PIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQIYGKRMGGL 129

II:: II I I I I I I I I I I I I I : I : : I : I : : : I : | 

Db 59 - 1 PAVFLK- EGLHGWADP FGS S LCLI LVGLFFARP L YRMNLLT I GDFYRNRFGRVAEVL 116 

Qy 130 LFIPALMGEMFWAAAIFSALGATISVIID--VDMHISVIISALIATLYTLVGGLYSVAYT 187

: : : : I I I III : I : : : I I : I I I I I : : I I I I 

Db 117 TTLCIWSYLGWVAAQIKALGLVFYTVSDGGLSQQTGMMIGAASVLVYTLFGGMWSVAVT 17 6 

Qy 188 DVVQLFCIFVGLWISVPFALSHPAVADIGFTAVHA----KYQKPWLGTVDSSEVYSWLDS 243

I : I : I : I : : : : : I I : II |: : : || :: : 

Db 177 DFIQMIIIVIGM-MYIGWEVSGQA-GGVATVVAHASAAGKFS--FWPAFNPIEVIGFVTA 232

Qy 244 FLLLMLGGIPWQAYFQRVLSS S SAT YAQVLS FLAAFGCLVMAI PAI LI GAI GA 296 

: : : I I I I I I I I I I I I : : : I I I I II : : I III 
Db 233 WITMMLGSIPQQDVFQRVTSSRTERIAGTASVLGGVLYFLFAFIPMFLAYSATLI 287 

Qy 297 STDWNQTAYGLPDPK TTEEADMILP-IVLQYLCPVYISFFGLGAVSAAVMSSADS 350 

II: : : : | | | : | | : : | : I I : : I : I ! : 

Db 288 DPQMVARYINTDSQLILPKLVLEH-APLVAQVMFFGALLSAIKSCASA 334 

Qy 351 SILSASSMFARNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTK-TVYGLWYLS 409

: : I : I I I I : : I I : I I : I I I I I I I : : : : : : 

Db 335 T LLAP S VT FAENVLR- PML P RMDDKRFLRVMQAWLVFTALVT LFALN S HLS I FHMVENA 393 

Qy 410 SDLVYIVIFPQLLCVLFVKGTNTYGAVAGYVSGL 443 

: : I I III I : II 

Db 394 YKVTLVAAFVPLAFGLFWKRATRQGGLLAIALGL 427 



RESULT 14 
Q9V2P3 

ID Q9V2P3 PRELIMINARY; PRT; 492 AA. 

AC Q9V2P3; 



DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Proline symporter (Proline permease).

GN PUTP-3 OR PYRAB00320 OR PAB2354.

OS Pyrococcus abyssi.

OC Archaea; Euryarchaeota; Thermococci; Thermococcales; Thermococcaceae;

OC Pyrococcus.

OX NCBI_TaxID=29292;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=GE5 / Orsay;

RA Heilig R.;

RT "Pyrococcus abyssi genome sequence: insights into archaeal chromosome

RT structure and evolution.";

RL Submitted (JUL-1999) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AJ248283; CAB48955.1; -. 

DR PIR; D75188; D75188. 

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Complete proteome. 

SQ SEQUENCE 492 AA; 53457 MW; A7C72B1AF29282B3 CRC64; 



Query Match 11.6%; Score 344; DB 17; Length 492; 

Best Local Similarity 24.2%; Pred. No. 1.7e-16; 

Matches 132; Conservative 99; Mismatches 196; Indels 118; Gaps 25;

Qy 8 LIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67 

I : I : : I : I I I : I I I I : I INI:: : : : 

Db 14 L VAFL FT L I L P I LVG F YAMKRT K S EEDFFVGGRAMDKITVALSAVSSGRSSWL 66 

Qy 68 INGTAEAVYVPGYGLAWAQAPIGYSLS LILGGLFFAKPMRSKGYVTMLDPFQQI YG 123 

: I : II | | : | | : : : I : I : | : | I : : 

Db 67 VLGLSGMAYKMGVTAVW — AAVGYIVAEMFQFVYMGIRLRKFSERFNAITVPDYFEARFR 124 

Qy 124 K RMGG LLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALI ATL 174 



Db 125 DTSKILRIAASIIIIIFLTSYVGAQFNAGA KTLSTALGI S I FTALMI SVLMI IV 178 

Qy 175 YTLVGGLYSVAYTDWQLFCI FVGLWI SVPFALSHPAVADI GFT AVHAKYQK 226

I : : I I : I I I I I : : : : I I : I I I I : I I I : I 

Db 179 YMI LGGFI AVAYNDVI RAVI MI I GLW LPVIAVAKVGGTEEVLKVLHALDPKLIN 233

Qy 227 PW LGTVDS S EVYSWLDS FLLLMLG- GI PWQAY- FQRVLS S S SAT YAQVLS FLAAFGC 281 

II II I : I I I I : 1:1 : I : : I 

Db 234 PWAFGAGWIG FLGI GFGS PGQPHI I VRYMS I DDPNKLRVST WGTFWN 282 

Qy 282 LVMAI PAILIGAIGASTDWNQTAYGLPDPKTT — EEADMILP-IVLQYLCPVYISFFGLG 338 

: I : I I I : I I : : I I : I : I I I : I I I : : I 

Db 283 WLAWGAIFVGLAGRAI VPDVSQLPGKNAEMIYPYLSAQYFPPILYGIL-IG 333 



Qy 339 AVS AAVMS SAD S S I L S AS SMFARN I YQL S FRQNA--SDKEIVWVMRITVFVFGASATAMA 396



: ||::|:||| :| :| :: :| : : 1:11 I I I : I" 

Db 334 GI FAAI LS TADS QLL WAST WKDLYQEVI KKGTKI DEKTALTI SRVTVLWGFLAAI LA 393 



Qy 397 LLTKTVYGLWYLSSDLVY-IVIF PQLLCVLFVKGTNTYGAVAGYVS GLFL 445

| : : | : :: I : I I I : hill : I : I I : I 

Db 394 YVAKDIIFWFVLFAWGGLGASFGPTLILSLYWKGTTKWGVLAGMIVGTIT 443 



Qy 446 RITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESG 505 

I I I I : I : I : I : I | : | : | : | : I 
Db 444 TIVW KLYLKPI TGLY-ELVP AFIFSLIATIIVSMITK 479 

Qy 506 TLPPK 510 

I I : 

Db 480 — PPE 482 



RESULT 15 
Q81Y52 

ID Q81Y52 PRELIMINARY; PRT; 492 AA. 

AC Q81Y52; 

DT 01-JUN-2003 (TrEMBLrel. 24, Created) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Sodium/proline symporter family protein. 

GN BA3705.

OS Bacillus anthracis (strain Ames). 

OC Bacteria; Firmicutes; Bacillales; Bacillaceae; Bacillus. 

OX NCBI_TaxID=198094;

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=22608414; PubMed=12721629;

RA Read T.D., Peterson S.N., Tourasse N., Baillie L.W., Paulsen I.T.,

RA Nelson K.E., Tettelin H., Fouts D.E., Eisen J.A., Gill S.R.,

RA Holtzapple E.K., Okstad O.A., Helgason E., Rilstone J., Wu M.,

RA Kolonay J.F., Beanan M.J., Dodson R.J., Brinkac L.M., Gwinn M.,

RA DeBoy R.T., Madpu R., Daugherty S.C., Durkin A.S., Haft D.H.,

RA Nelson W.C., Peterson J.D., Pop M., Khouri H.M., Radune D.,

RA Benton J.L., Mahamoud Y., Jiang L., Hance I.R., Weidman J.F.,

RA Berry K.J., Plaut R.D., Wolf A.M., Watkins K.L., Nierman W.C.,

RA Hazen A., Cline R., Redmond C., Thwaite J.E., White O., Salzberg S.L.,

RA Thomason B., Friedlander A.M., Koehler T.M., Hanna P.C., Kolsto A.-B.,

RA Fraser C.M.;

RT "The genome sequence of Bacillus anthracis Ames and comparison to 

RT closely related bacteria."; 

RL Nature 423:81-86(2003). 

DR EMBL; AE017035; AAP27454.1; -.

DR TIGR; BA3705; -.

DR GO; GO:0016020; C:membrane; IEA.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Complete proteome.

SQ SEQUENCE 492 AA; 53891 MW; E2377D735C1A90F9 CRC64;



Query Match 11.2%; Score 334; DB 16; Length 492; 

Best Local Similarity 22.9%; Pred. No. 9e-16; 

Matches 129; Conservative 101; Mismatches 219; Indels 114; Gaps 17; 

Qy 5 VEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVG 64 

: I : : : : : : : I : I I : : I : : : I I I : I I : I : : 

Db 3 IEIMVSLAI YMAGMLYIGYWSYKKTSDLSD YML G G RG L G P AVT AL SAGAS DMS 55 

Qy 65 GGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGG LFFAKPMR SKGYVTML 115 

I : I |:|| I : : I : : I I I : I : : I : 

Db 56 GWMLMGL P GAMY AT GL S S VW IAIGLLIGAYANYLILAPRLRTYTEVANDSITIP 109 

Qy 116 DPFQQIYGKRMGGLLFIPA LMGEMFWAAAI F SAL GAT I SVI I DVDMH I SVI I SAL I A 172 

I : : I I I : I I : I : I : I : I : : I I : : : : 

Db 110 DFLENRFKDRTKILRFVSAIVILVFFTFYASAGLVSGGRLFENSFNLDYKIGLFVTVGW 169 

Qy 173 TLYTLVGGLYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIG FTAVHAKYQKP 227 

I I I I I : I :: I I II : I : I : I I I I : I I = 

Db 170 VAYTLFGGFLAVSWTDFVQGCIMFIAL-VLVPIV AFT DVGGVT ET FNT I K 218 

Qy 228 WLGTVDSSEVYSWLDSFLLLMLGGIPW-QAYF QRVL S S S S AT YAQVL S FLAAFG 280 

M : I : : : : I : : : I II I : : : I : : 

Db 219 — •-QVDASHLDMFKGTTILGIISFLAWGLGYFGQPHIIVRFMAITSIKDLKTSRRIGIGW 275 

Qy 281 C LVMAI PAI L I GAI GAS T DWNQTAYG L P D P KTT E EADMI L P I VLQ YL C P VYI S F FGLGAV 340 

: I I : I I : I II : I : : : I : I I I : I I I : 

Db 276 MTISIIGAMLTGLVG IAYYAKNNATLQDPEMVFVTFSNILFHPYITGFLLSAI 328 

Qy 341 S AAVMS SAD S S I L S AS SMFARN I YQ L S FRQNAS DKE I VWVMRI T VFVFGAS AT AMALLT K 400 

I : : I M I : I II : ! : I I : I I I I I : I : : I : : I I I : I 
Db 329 LAS IMS SI S SQLLVI S SAVTEDFYKT FFRRKASDKELVFI GRLSVLWAMIAWLA 384 

Qy 401 TVYGLWYLSSDLVYIVI FPQLLCVLFVKGTNTYGAVAGYVSGLFLRITG 449 

I ||: : : I : I I : I I I : I : I I : I : I I 

Db 385 YHP S DT I LTLVG YAWAGFGSAFGPAI LLS LYWKRTNKWGVLAGMI VGAL VVI TW 438

Qy 450 GE-PYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLP 508 

: I I ||:: II I : I :~ I 
Db 439 VQIPSLKASMYEMVPGFF CSLLAVIIVSLVTK 470 

Qy 509 PKLDVFDAWARHSEENMDKTIL 531 

MINI -I 

Db 471 E P VKAI H RE FN EMEAVL 487 



Search completed: March 22, 2004, 15:35:00 
Job time : 113 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on: March 22, 2004, 13:57:40 ; Search time 26 Seconds

(without alignments) 
1161.565 Million cell updates/sec 

Title: US-10-069-541-6

Perfect score: 2972

Sequence: 1 MAFHVEGLIAIIVFYLLILL EAFLDVDSSPEGSGTEDNLQ 580

Scoring table: BLOSUM62

Gapop 10.0 , Gapext 0.5

Searched: 141681 seqs, 52070155 residues

Total number of hits satisfying chosen parameters: 141681

Minimum DB seq length: 0

Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%

Maximum Match 100%
Listing first 45 summaries

Database : SwissProt 42:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

ALIGNMENTS 



RESULT 1 
SL51_RABIT 

ID SL51_RABIT STANDARD; PRT; 662 AA. 

AC P11170; 

DT 01-JUL-1989 (Rel. 11, Created) 

DT 01-JUL-1989 (Rel. 11, Last sequence update) 

DT 15-JUL-1998 (Rel. 36, Last annotation update) 

DE Sodium/glucose cotransporter 1 (Na(+)/glucose cotransporter 1)

DE (High affinity sodium-glucose cotransporter).

GN SLC5A1 OR SGLT1.

OS Oryctolagus cuniculus (Rabbit).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Lagomorpha; Leporidae; Oryctolagus. 

OX NCBI_TaxID=9986; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=New Zealand white;

RX MEDLINE=88065856; PubMed=2446136;

RA Hediger M.A., Coady M.J., Ikeda T.S., Wright E.M.;

RT "Expression cloning and cDNA sequencing of the Na+/glucose co- 

RT transporter."; 

RL Nature 330:379-381(1987). 

RN [2] 

RP SEQUENCE FROM N.A. 



RC STRAIN=New Zealand white; TISSUE=Kidney cortex;

RX MEDLINE=91223090; PubMed=2025641;

RA Morrison A.I., Panayotova-Heiermann M., Feigl G., Schoelermann B.,

RA Kinne R.K.H.;

RT "Sequence comparison of the sodium-D-glucose cotransport systems in

RT rabbit renal and intestinal epithelia.";

RL Biochim. Biophys. Acta 1089:121-123(1991). 

CC -!- FUNCTION: Actively transports glucose into cells by Na(+) co-

CC transport with a Na(+) to glucose coupling ratio of 2:1.

CC -!- FUNCTION: Efficient substrate transport in mammalian kidney is

CC provided by the concerted action of a low affinity high capacity

CC and a high affinity low capacity Na(+)/glucose cotransporter

CC arranged in series along kidney proximal tubules. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- TISSUE SPECIFICITY: Found predominantly in intestine and in outer 
CC renal medulla. 

CC -!- DISEASE: Mutation of Asp-28 is implicated in glucose/galactose 
CC malabsorption. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).



CC 












DR EMBL; X06419; CAA29727.1; -.
DR EMBL; X55355; CAA39040.1; -.
DR PIR; S00515; A37226.
DR InterPro; IPR001734; Na/solut_symport.
DR Pfam; PF00474; SSF; 1.
DR TIGRFAMs; TIGR00813; sss; 1.
DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.
DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.
DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.
KW Transport; Sugar transport; Transmembrane; Sodium trans
KW Glycoprotein.








FT DOMAIN 1 28 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 29 47 POTENTIAL.
FT DOMAIN 48 64 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 65 85 POTENTIAL.
FT DOMAIN 86 105 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 106 126 POTENTIAL.
FT DOMAIN 127 171 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 172 191 POTENTIAL.
FT DOMAIN 192 208 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 209 229 POTENTIAL.
FT DOMAIN 230 270 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 271 291 POTENTIAL.
FT DOMAIN 292 314 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 315 334 POTENTIAL.
FT DOMAIN 335 423 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 424 443 POTENTIAL.
FT DOMAIN 444 455 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 456 476 POTENTIAL.
FT DOMAIN 477 526 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 527 547 POTENTIAL.
FT DOMAIN 548 640 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 641 661 POTENTIAL.
FT DOMAIN 662 662 EXTRACELLULAR (POTENTIAL).
FT CARBOHYD
FT SITE 43 43 IMPLICATED IN SODIUM COUPLING
FT (BY SIMILARITY).
FT SITE 300 300 IMPLICATED IN SODIUM COUPLING
FT (BY SIMILARITY).
SQ SEQUENCE 662 AA; 73079 MW; 03F55A0309CBBE01 CRC64;



Query Match 10.4%; Score 308.5; DB 1; Length 662; 

Best Local Similarity 23.4%; Pred. No. 2.6e-14; 

Matches 154; Conservative 110; Mismatches 238; Indels 155; Gaps 26; 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70

|::::|::: I I : I I : I I I : : II : | :: |: :| |: I

Db 32 IVIYFLWMAVGLWAMFST-NRGTV GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 86

Qy 71 TAEAVYVP G YGLAWAQAP I GY S LSLILGGLFFAKPMRSKGYVTMLDPFQQI Y-GK 124

| Ml | | : : : : | | : | : I : I I I I : I : : I I

Db 87

Qy 125

| : | | : | : : I : I I I I II I : : : I ::::: I I : I I I I : I I I

Db 140 RIQIYLSILSLLLYIFTKISADIFS--GAIFIQLTLGLDIYVAIIILLVTTGLYTITGGL 197

Qy 182 Y S VAYT D WQ L FC I FVGLW I S VP FAL S H P AVAD I G FT AVH AK Y Q 225

: I I I I : I : I I I I I hi I I I

Db 198 AAVI YTDTLQTAIMMVGSVILTGFAFHEVG GYEAFTEKYMRAIPSQISYGNTSIPQ 253

Qy 226 KPWLGTVDS S EVYSWLDS FLLLMLGGI PW QAYFQRVLS S S SA 267

I : I : : I : I III I 1111-

Db 254 KCYTPREDAFHI FRDAITGDIPWPGLVFGMSILTLWYWCTDQVIVQRCLSAKNL 307

Qy 268 T YAQVL S FLAAFGC LVMAI PAI L I GAI GAS T DWNQTAYGL P D P KTTEEADMILP 321

: : : | : : : : : : I : : : I : I : - I

Db 308 S HVKAGCI LCGYLKVMPMFLI VMMGMVS RI LYTDKVACWP S ECERYCGTRVGCTNI AFP 367

Qy 322 IVLQYLC PVYI S FFGLGAVSAAVMS S ADS S ILSAS SMFARN I YQLS FRQNAS DKEIVWVM 381

: : || : I : I : : I I I I I I I : : I : I I I : I I : I I : :

Db 368 TLWELMPNGLRGLMLSVMMASLMSSLTSIFNSASTLFTMDIY-TKIRKKASEKELMIAG 426

Qy 382 RI-TVFVFGASATAMALLTKTVYG--LWYLSSDLVYI--VIFPQLLCVLFVKGTNTYGAV 436

| : : | : | | : : : I I : I I : I I : I I I I I

Db 427

Qy 437 -- TG GEPYLYLQPLIFYPGYYPDDNGIY 473

II I I I I :: I

Db 487 fGTGSCMEPSNCPTIICGVHYLYFAIILF 534

Qy 474

I I : I : : I I : I : : : : I : I : I

Db 535 -VISIITWWSLFTKPI PDVHLYRLCWSLRNSKE 568

Qy 533 KNEN I KLD- - ELALVKPRQSMTLS STFTNKEAF- LDVDSSPEGSGTED 577

I I I I I : : : I : I : I III h : h

Db 569 -- ERIDLDAGEEDIQEAPEEATDTEVPKKKKGFFRRAYDLFCGLDQDKGPKMTKEEE 623



RESULT 2 
SL52_RAT 

ID SL52_RAT STANDARD; PRT; 670 AA. 

AC P53792; 

DT 01-OCT-1996 (Rel. 34, Created) 

DT 01-OCT-1996 (Rel. 34, Last sequence update)

DT 15-MAR-2004 (Rel. 43, Last annotation update)

DE Sodium/glucose cotransporter 2 (Na(+)/glucose cotransporter 2)

DE (Low affinity sodium-glucose cotransporter).

GN SLC5A2 OR SGLT2.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Sprague-Dawley; TISSUE=Kidney; 

RX MEDLINE=96094332; PubMed=7493971;

RA You G., Lee W.-S., Barros E.J.G., Kanai Y., Huo T.-L., Khawaja S.,

RA Wells R.G., Nigam S.K., Hediger M.A.;

RT "Molecular characteristics of Na(+)-coupled glucose transporters in

RT adult and embryonic rat kidney.";

RL J. Biol. Chem. 270:29365-29371(1995). 

CC -!- FUNCTION: Sodium-dependent glucose transporter. Has a Na+ to 
CC glucose coupling ratio of 1:1. 

CC -!- FUNCTION: Efficient substrate transport in mammalian kidney is 

CC provided by the concerted action of a low affinity high capacity 

CC and a high affinity low capacity Na(+)/glucose cotransporter

CC arranged in series along kidney proximal tubules.

CC -!- SUBCELLULAR LOCATION: Integral membrane protein.

CC -!- TISSUE SPECIFICITY: Kidney, in proximal tubule S1 segments.

CC -!- DEVELOPMENTAL STAGE: Appears on embryonic day 17 and gradually 

CC increases until day 19. Decreases between day 19 and birth. 

CC -!- PTM: GLYCOSYLATED AT A SINGLE SITE. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; U29881; AAC52325.1; -.

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.

DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Sugar transport; Transmembrane; Sodium transport; Symport; 

KW Glycoprotein. 
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CYTOPLASMIC (POTENTIAL).


FT TRANSMEM 24 42 POTENTIAL.
FT DOMAIN 43 59 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 60 80 POTENTIAL.
FT DOMAIN 81 100 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 101 121 POTENTIAL.
FT DOMAIN 122 166 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 167 187 POTENTIAL.
FT DOMAIN 188 203 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 204 224 POTENTIAL.
FT DOMAIN 225 268 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 269 289 POTENTIAL.
FT DOMAIN 290 312 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 313 332 POTENTIAL.
FT DOMAIN 333 421 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 422 441 POTENTIAL.
FT DOMAIN 442 453 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 454 474 POTENTIAL.
FT DOMAIN 475 524 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 525 545 POTENTIAL.
FT DOMAIN 546 648 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 649 669 POTENTIAL.
FT CARBOHYD 248 248 N-LINKED (GLCNAC...) (PROBABLE)
SQ SEQUENCE 670 AA; 72961 MW; 0609562861618BB3 CRC64;



Query Match 10.4%; Score 308; DB 1; Length 670; 

Best Local Similarity 23.3%; Pred. No. 2.9e-14; 

Matches 148; Conservative 95; Mismatches 209; Indels 184; Gaps 28; 



Qy 8 LI AI I VFYLLI LLVGI WAAWRT KN S GSAEERS EAI I VGGRDI GLLVGGFTMTAT WGGGY 67 

: : | : : | I : : I I : I : : I I I I : : I I : | : : | : : | I : 

Db 24 I LVI AAYFLLVI GVGLWSMFRT-NRGTV GGYFLAGRSMVWWPVGAS L FASNI GS GH 78 



Qy 68 I N GT AEA VYVP G YG LAWAQAP I G Y S LSLILGGLFFAKPMRSKGYVTMLDPFQQIY 122 

II III [ I : : I : I I I I : : I : I I I 
Db 79 FVGLA GT GAAS G LAVAG F E WN AL FWL L L GW L FVP VYL - T AGVI TM PQYL 127

Qy 123 GKRMGG LLFI PALMGEMFWAAAI F — SALGATISVIIDVDMHISVIIS 168 

I I I I I : I : : : I : I III I : I I I 

Db 128 RKRFGGRRIRLYLSVLSLFLYIFTKISVDMFSGAVFIQQALGWNI YASVIAL 179 

Qy 169 AL I AT L YT L VG G L Y S VAYT DWQ LFCIFVGLWISVP FAL S H P AVAD I G FT AVHAK Y 224 

I : I I : I I I :: I I I I I I I I : I : I | : : : I I 

Db 180 LG I TMI YT VT GGLAALMYT DT VQT FVI LAGAFI LT GYAFHEVG GYSGLFDKYLGAV 235 

Qy 225 QKPWLGTVDSSEVYSWLDSFLLL MLGGIPW QAY 257 

: I : I : I : I I : I I : I I : I I I 

Db 236 TSLTVSKDPAVGNISSTCYQPRPDSYHLLRDPVTGGLPWPALLLGLTIVSGWHWCSDQVI 295 



Qy 258 FQRVLS S S SAT YAQ VL S FLAAFGC LVMAI P AI L I GAI GAS TDWNQT AYGL P D P KTT 313 

| | | : : I : : : : I : I : I : : : : I I I 

Db 296 VQRCLAGKNLTHIKAGCILCGYLKLMPMFLMVMPGMI SRILY— PD 339 

Qy 314 EEADMILPIVLQYLC PVYISFFGLGAVSAAVMSSADSSILS 354 

| : : | | : : I I : I : I I : I I I I I 

Db 340 -EVACWPEVCKRVCGTEVGCSNIAYPRLWKLMPNGLRGLMLAVMLAALMSSLASIFNS 398 



Qy 355 ASSMFARNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMALLTKTVYG LWYLSSD 411 

: I : : I : I I I I I : I : : I I : I I : I : : I hi 

Db 399 SSTLFTMDIY-TRLRPRAGDRELLLVGRLWWFIVAVSVAWLPWQAAQGGQLFDYIQSV 457

Qy 412 LVYIV — IFPQLLCVLFVKGTNTYGAVAGYVSGLFLRI TGG— EP 452 

|: : :MI | | | I : I I : : II I 

Db 458 SSYLAPPVSAVFVLALFVPRVNEKGAFWGLIGGLLMGLARLIPEFFFGTGSCVRPSACPA 517 

Qy 453 YLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGT 506 

III : : I : MM:: I : I : : I 
Db 518 IFCRVHYLYFAIILFFCS GFLTLA- 1 SRCTAPI PQKHLHRLVFS 560 

Qy 507 L P P KL D VF D AWARH S E ENMD KT I L VKN EN I KL D E L 542 

I I I : I : h : : I I 

Db 561 LRHSKE EREDLDAEEL 576 



RESULT 3 
SL51_HUMAN 

ID SL51_HUMAN STANDARD; PRT; 664 AA.

AC P13866; 

DT 01-JAN-1990 (Rel. 13, Created) 

DT 01-JAN-1990 (Rel. 13, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Sodium/glucose cotransporter 1 (Na(+)/glucose cotransporter 1)

DE (High affinity sodium-glucose cotransporter).

GN SLC5A1 OR SGLT1.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=89345544; PubMed=2490366;

RA Hediger M.A., Turk E., Wright E.M.;

RT "Homology of the human intestinal Na+/glucose and Escherichia coli

RT Na+/proline cotransporters.";

RL Proc. Natl. Acad. Sci. U.S.A. 86:5748-5752(1989). 

RN [2] 

RP SEQUENCE FROM N.A. 

RA Swan M. ; 

RL Submitted (JUN-1998) to the EMBL/GenBank/DDBJ databases.

RN [3]

RP SEQUENCE FROM N.A.

RX MEDLINE=20057165; PubMed=10591208;

RA Dunham I., Hunt A.R., Collins J.E., Bruskiewich R., Beare D.M.,

RA Clamp M., Smink L.J., Ainscough R., Almeida J.P., Babbage A.K.,

RA Bagguley C., Bailey J., Barlow K.F., Bates K.N., Beasley O.P.,

RA Bird C.P., Blakey S.E., Bridgeman A.M., Buck D., Burgess J.,

RA Burrill W.D., Burton J., Carder C., Carter N.P., Chen Y., Clark G.,

RA Clegg S.M., Cobley V.E., Cole C.G., Collier R.E., Connor R.,

RA Conroy D., Corby N.R., Coville G.J., Cox A.V., Davis J., Dawson E.,

RA Dhami P.D., Dockree C., Dodsworth S.J., Durbin R.M., Ellington A.G.,

RA Evans K.L., Fey J.M., Fleming K., French L., Garner A.A.,

RA Gilbert J.G.R., Goward M.E., Grafham D.V., Griffiths M.N.D., Hall C.,

RA Hall R.E., Hall-Tamlyn G., Heathcott R.W., Ho S., Holmes S.,

RA Hunt S.E., Jones M.C., Kershaw J., Kimberley A.M., King A.,

RA Laird G.K., Langford C.F., Leversha M.A., Lloyd C., Lloyd D.M.,

RA Martyn I.D., Mashreghi-Mohammadi M., Matthews L.H., Mccann O.T.,

RA Mcclay J., Mclaren S., McMurray A.A., Milne S.A., Mortimore B.J.,

RA Odell C.N., Pavitt R., Pearce A.V., Pearson D., Phillimore B.J.C.T.,

RA Phillips S.H., Plumb R.W., Ramsay H., Ramsey Y., Rogers L., Ross M.T.,

RA Scott C.E., Sehra H.K., Skuce C.D., Smalley S., Smith M.L.,

RA Soderlund C., Spragon L., Steward C.A., Sulston J.E., Swann R.M.,

RA Vaudin M., Wall M., Wallis J.M., Whiteley M.N., Willey D.L.,

RA Williams L., Williams S.A., Williamson H., Wilmer T.E., Wilming L.,

RA Wright C.L., Hubbard T., Bentley D.R., Beck S., Rogers J., Shimizu N.,

RA Minoshima S., Kawasaki K., Sasaki T., Asakawa S., Kudoh J.,

RA Shintani A., Shibuya K., Yoshizaki Y., Aoki N., Mitsuyama S.,

RA Roe B.A., Chen F., Chu L., Crabtree J., Deschamps S., Do A., Do T.,

RA Dorman A., Fang F., Fu Y., Hu P., Hua A., Kenton S., Lai H., Lao H.I.,

RA Lewis J., Lewis S., Lin S.-P., Loh P., Malaj E., Nguyen T., Pan H.,

RA Phan S., Qi S., Qian Y., Ray L., Ren Q., Shaull S., Sloan D., Song L.,

RA Wang Q., Wang Y., Wang Z., White J., Willingham D., Wu H., Yao Z.,

RA Zhan M., Zhang G., Chissoe S., Murray J., Miller N., Minx P.,

RA Fulton R., Johnson D., Bemis G., Bentley D., Bradshaw H., Bourne S.,

RA Cordes M., Du Z., Fulton L., Goela D., Graves T., Hawkins J.,

RA Hinds K., Kemp K., Latreille P., Layman D., Ozersky P., Rohlfing T.,

RA Scheet P., Walker C., Wamsley A., Wohldmann P., Pepin K., Nelson J.,

RA Korf I., Bedell J.A., Hillier L.W., Mardis E., Waterston R.,

RA Wilson R., Emanuel B.S., Shaikh T., Kurahashi H., Saitta S.,

RA Budarf M.L., McDermid H.E., Johnson A., Wong A.C.C., Morrow B.E.,

RA Edelmann L., Kim U.J., Shizuya H., Simon M.I., Dumanski J.P.,

RA Peyrard M., Kedra D., Seroussi E., Fransson I., Tapia I., Bruder C.E.,

RA O'Brien K.P., Wilkinson P., Bodenteich A., Hartman K., Hu X.,

RA Khan A.S., Lane L., Tilahun Y., Wright H.;

RT "The DNA sequence of human chromosome 22."; 

RL Nature 402:489-495(1999).

RN [4]

RP VARIANT GGM ASN-28.

RX MEDLINE=91179516; PubMed=2008213;

RA Turk E., Zabel B., Mundlos S., Dyer J., Wright E.M.;

RT "Glucose/galactose malabsorption caused by a defect in the 

RT Na+/glucose cotransporter."; 

RL Nature 350:354-356(1991). 

RN [5] 

RP VARIANT GGM GLY-28.

RX MEDLINE=94253082; PubMed=8195156; 

RA Turk E., Martin M.G., Wright E.M.; 

RT "Structure of the human Na+/glucose cotransporter gene SGLT1.";

RL J. Biol. Chem. 269:15204-15209(1994). 

CC -!- FUNCTION: Actively transports glucose into cells by Na(+)

CC co-transport with a Na(+) to glucose coupling ratio of 2:1.

CC -!- FUNCTION: Efficient substrate transport in mammalian kidney is

CC provided by the concerted action of a low affinity high capacity

CC and a high affinity low capacity Na(+)/glucose cotransporter

CC arranged in series along kidney proximal tubules. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- TISSUE SPECIFICITY: Expressed mainly in intestine and kidney. 

CC -!- DISEASE: Defects in SLC5A1 are the cause of congenital glucose-
CC galactose malabsorption (GGM) [MIM:606824]. GGM is an intestinal
CC monosaccharide transporter deficiency. It is an autosomal
CC recessive disorder manifesting itself within the first weeks of
CC life. It is characterized by severe diarrhea and dehydration which
CC are usually fatal unless glucose and galactose are eliminated from
CC the diet.
CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; L29339; AAB59448.1; -.
DR EMBL; L29328; AAB59448.1; JOINED.
DR EMBL; L29330; AAB59448.1; JOINED.
DR EMBL; L29329; AAB59448.1; JOINED.
DR EMBL; L29331; AAB59448.1; JOINED.
DR EMBL; L29332; AAB59448.1; JOINED.
DR EMBL; L29333; AAB59448.1; JOINED.
DR EMBL; L29334; AAB59448.1; JOINED.
DR EMBL; L29335; AAB59448.1; JOINED.
DR EMBL; L29336; AAB59448.1; JOINED.
DR EMBL; L29337; AAB59448.1; JOINED.
DR EMBL; L29338; AAB59448.1; JOINED.
DR EMBL; M24847; AAA60320.1; -.
DR EMBL; AL022321; -; NOT_ANNOTATED_CDS.
DR EMBL; Z83849; -; NOT_ANNOTATED_CDS.
DR EMBL; Z74021; -; NOT_ANNOTATED_CDS.
DR EMBL; Z80998; -; NOT_ANNOTATED_CDS.
DR EMBL; Z83839; -; NOT_ANNOTATED_CDS.
DR PIR; A33545; A33545.
DR Genew; HGNC:11036; SLC5A1.
DR MIM; 182380; -.
DR MIM; 606824; -.
DR GO; GO:0005887; C:integral to plasma membrane; TAS.
DR InterPro; IPR001734; Na/solut_symport.
DR Pfam; PF00474; SSF; 1.
DR TIGRFAMs; TIGR00813; sss; 1.
DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.
DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.
DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.
KW Transport; Sugar transport; Transmembrane; Sodium transport; Symport;
KW Glycoprotein; Disease mutation.
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VARIANT 


28 
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D -> N (IN GGM) . 
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/FTId=VAR 007168. 


SQ SEQUENCE 664 AA; 73497 MW; 2B403376595EAB74 CRC64;



(BY SIMILARITY) 



Query Match 10.3%; Score 306; DB 1; Length 664; 

Best Local Similarity 22.8%; Pred. No. 4e-14; 

Matches 148; Conservative 104; Mismatches 218; Indels 178; Gaps 30; 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70 

I :::::::: I I : I I : I I I : : I I : I : : I : : I I : I 

Db 32 I VI Y FVWMAVG LWAMF S T - N RGT V GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 86 

Qy 71 TAEAVYVPGYGLAWAQAP I G YS LSLILGGLFFAKPMRSK-GYVTMLDPFQQI YGK 124 

I III II: I : : I I I I I : I I I I I : I 

Db 87 LA GTGAAS GI AI GG FEWNALVL VWLGWLFV — PIYIKAGWTM PEYLRK 134 

Qy 125 RMGG LLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SAL I A 172 

Ml I I : I : : : I I I | :::::::::: : I 

Db 135 RFGGQRIQVYLSLLSLLLYI FTKI SADI FSGAI F INLALGLNLYLAI FLLLAIT 188 

Qy 173 T L YT LVGGL Y S VAYT D WQL FC I FVGLWI S VP FAL S H P AVAD I G FT AVHAK YQ K — PWL- 229 

I I I : I I I : I I I I : I : I I I II hi Mil: 

Db 189 ALYT I TGGLAAVI YTDT LQTVIMLVGS LI LTGFAFHEVG GYDAFMEKYMKAI PTI V 244 

Qy 230 GTVDSSEVYS-WLDSFLLL MLGGIPW QAYFQRVLSS 264 

I : I : I I I : : I : I I I II II: 

Db 245 SDGNTTFQEKCYTPRADSFHIFRDPLTGDLPWPGFIFGMSILTLWYWCTDQVIVQRCLSA 304 

Qy 265 SSATYAQ VL S FLAAFGCLVMAI PAI L IGAI GASTDWNQT 303 

I 1 I ■ * 2 | I | I | t I | 2 I 

Db 305 KNMSHVKGGCILCGYLKLMPMFIMVMPGMISRILYTEKIACWPSECEKYCGTKVGCTNI 364 

Qy 304 AYGLPDPKTTEEADMI LPI VLQYLCPVYI S FFGLGAVSAAVMS SADS S I LSAS SMFARNI 363 

II | : : | | : : | : : || I I I I I : : I : I 

Db 365 AY PT L WE LMPNGL RGLML S VMLAS LMS S LT S I FN S AS T L FTMD I 409 

Qy 364 YQLS FRQNAS DKEI VWVMRI TVFV- FGASATAMALLTKTVYG- - LW YLS S DLVYI — VI F 418 

I I : I I : I I : : I : : I I I : : : I I : I I : I 

Db 410 Y-AKVRKRASEKELMIAGRLFILVLIGISIAWVPIVQSAQSGQLFDYIQSITSYLGPPIA 468 

Qy 419 PQLLCVLFVKGTNTYGAVAGYVSGLFLRI TG GEPYLY 455 

I : I I I I I I : I I : I II I I I I 

Db 469 AVFLLAIFWKRVNEPGAFWGLILGLLIGISRMITEFAYGTGSCMEPSNCPTIICGVHYLY 528 

Qy 456 LQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFD 515 

: : I | | : | : I I | | : I : : : 



Db 529 FAIILF AI SFITIWISLLTKPI PDVHLYR 558 

Qy 516 AV- -VARHS EENMDKTI LVKNENI KLDELALVKPRQSMTLS STFTNKE 561 

: II : I : • III: | : : : : : : | : 

Db 559 LCWSLRNSKEERID — LDAEEENIQ EGPKETIEIETQVPEKK 598 

RESULT 4 
SL51_RAT 

ID SL51_RAT STANDARD; PRT; 665 AA. 

AC P53790; P97787; 

DT 01-OCT-1996 (Rel. 34, Created) 

DT 01-OCT-1996 (Rel. 34, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update)

DE Sodium/glucose cotransporter 1 (Na(+)/glucose cotransporter 1)

DE (High affinity sodium-glucose cotransporter).

GN SLC5A1 OR SGLT1.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Sprague-Dawley; TISSUE=Kidney; 

RX MEDLINE=94216314; PubMed=8163506;

RA Lee W.S., Kanai Y., Wells R.G., Hediger M.A.;

RT "The high affinity Na+/glucose cotransporter. Re-evaluation of 

RT function and distribution of expression."; 

RL J. Biol. Chem. 269:12032-12039(1994). 

RN [2] 

RP SEQUENCE FROM N.A. 

RA Kasahara M., Mori K.;
RL Submitted (MAY-1993) to the EMBL/GenBank/DDBJ databases.

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Sprague-Dawley; TISSUE=Jejunum;
RA Aoshima H., Yokoyama T., Tanizaki J., Izu H., Yamada M.;
RT "The sugar specificity of Na/glucose cotransporter from rat jejunum.";
RL Submitted (JAN-1997) to the EMBL/GenBank/DDBJ databases.

CC -!- FUNCTION: Actively transports glucose into cells by Na(+) co-
CC transport with a Na(+) to glucose coupling ratio of 2:1.
CC -!- FUNCTION: Efficient substrate transport in mammalian kidney is
CC provided by the concerted action of a low affinity high capacity
CC and a high affinity low capacity Na(+)/glucose cotransporter
CC arranged in series along kidney proximal tubules.
CC -!- SUBCELLULAR LOCATION: Integral membrane protein.
CC -!- DEVELOPMENTAL STAGE: Appears on embryonic day 18 and gradually
CC increases until birth.
CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC

DR EMBL; U03120; AAA19015.1; -. 

DR EMBL; D16101; BAA03676.1; -. 

DR EMBL; AB000729; BAA19172.1; -. 

DR PIR; A53582; A53582. 

DR InterPro; IPR001734; Na/solut_symport.
DR Pfam; PF00474; SSF; 1.
DR TIGRFAMs; TIGR00813; sss; 1.
DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.
DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.
DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Sugar transport; Transmembrane; Sodium transport; Symport; 

KW Glycoprotein. 
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N-LINKED (GLCNAC. . .) (POTENTIAL) 


FT CONFLICT 354 354 Y -> H (IN REF. 3).
SQ SEQUENCE 665 AA; 73066 MW; A92038D964BFF061 CRC64;

Query Match 10.3%; Score 306; DB 1; Length 665;
Best Local Similarity 23.5%; Pred. No. 4e-14;
Matches 155; Conservative 105; Mismatches 242; Indels 158; Gaps 29;

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70

| :::::::: I I : I I : I I I : : I I : I : : I : : I I : I 

Db 32 I VI Y FVWMAVGL WAMF S T - N RGT V GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 86

Qy 71 T AEAVYVP G YG LAWAQAP I G Y S L S LILGGLFFAKPMRSK-GYVTMLDPFQQIYGK 124

| Ml ||:: : : I I I I I : I I I I I = I 

Db 87 LA GT GAAAGI AMGGFEWNALVFWVLGWL FV -- P I YI KAGWTM PEYLRK 134

Qy 125 RMGG LLFI PALMGEMFWAAAI FSALGATI SVI IDVDMHI SVI I SAL I A 172 

Ml | | : | : : : I I I I : : : : | : : : : : I I 

Db 135 RFGGKRIQIYLSVLSLLLYIFTKISADIFSGAIF INLALGLDIYLAIFILLAIT 188 

Qy 173 TLYTLVGGLYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQK— PWL- 229 

I | | : I I I : I I I I : I : I I : I I I hi I I I II 



Db 189 ALYTITGGLAAVI YTDTLQTAIMLVGSFILTGFAFREVG GYEAFMDKYMKAIPTLV 244



Qy 230 — GTVD-SSEVYS-WLDSFLLL MLGGIPW -QAYFQRVLSS 264 

I : I I : I I I : : I : I I I 1111 = 

Db 245 SDGNITVKEECYTPRADSFHIFRDPITGDMPWPGLIFGLSILALWYWCTDQVIVQRCLSA 304 

Qy 265 S SAT YAQVLS FLAAFGCLVMAI PAI LI GAI GASTDWNQTAYGLPDP KTTEEADM 318 

: :: : I : I : :: I I :: I I I :: 

Db 305 KNMSHVKAGCTLCGYLKLLPMFLMVMPGMISRILYTDKIACVLPSECKKYCGTPVGCTNI 364 

Qy 319 ILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIV 378

I : : II : I : I : : I I I I I I I : : I : I I I : I I : I I : : 
Db 365 AYPTLWELMPNGLRGLMLSVMMASLMSSLTSIFNSASTLFTMDIY-TKIRKGASEKELM 423 

Qy 379 WVMRITVFV-FGASATAMALLTKTVYG--LWYLSSDLVYI--VIFPQLLCVLFVKGTNTY 433

I : : I I I : : : I I : I I : I I : I I I 

Db 424 IAGRLFILVLIGISIAWVPIVQSAQSGQLFDYIQSITSYLGPPIAAVFLLAI FCKRVNEP 483 

Qy 434 GAVAGYVSGLFLRITG GEPYLYLQPLIFYPGYYPDDN 470

I I I : I : I II I I I I : : I 
Db 484 GAFWGLILGFLIGISRMITEFAYGTGSCMEPSNCPKIICGVHYLYFAIILF 534 

Qy 471 GIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAV--VARHSEENMDK 528

I : I : II I I : I : : : : : I I : I 

Db 535 AISWTVLVISLLTKPI PDVHLYRLCWSLRNSTEERID- 572 

Qy 529 TILVKNENIKLDELALVKPRQSMTLSSTFTNKE AFLDVDSSPEGSGTED 577 

I I : : I I : : : : : II III h : h 

Db 573 — LDAGEEEPVEE DPKDTIEIDAEAPQKEKGCFRKAYDLFCGLDQDKGPKMTKEEE 626 



RESULT 5 
SL54_PIG 

ID SL54_PIG STANDARD; PRT; 660 AA. 

AC P31636; 

DT 01-JUL-1993 (Rel. 26, Created)
DT 01-JUL-1993 (Rel. 26, Last sequence update)
DT 16-OCT-2001 (Rel. 40, Last annotation update)
DE Low affinity sodium-glucose cotransporter (Sodium/glucose
DE cotransporter 3) (Na(+)/glucose cotransporter 3).
GN SLC5A4 OR SGLT3 OR SAAT1.
OS Sus scrofa (Pig).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Suina; Suidae; Sus.

OX NCBI_TaxID=9823; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Kidney;
RX MEDLINE=93131881; PubMed=8420925;

RA Kong C.-T., Yet S.-F., Lever J.E.; 

RT "Cloning and expression of a mammalian Na+/amino acid cotransporter 

RT with sequence similarity to Na+/glucose cotransporters.";

RL J. Biol. Chem. 268:1509-1512(1993). 

RN [2] 

RP FUNCTION. 

RX MEDLINE=94357885; PubMed=8077195;
RA McKenzie B., Panayotova-Heiermann M., Loo D.D.F., Lever J.E.,
RA Wright E.M.;
RT "SAAT1 is a low affinity Na+/glucose cotransporter and not an amino
RT acid transporter. A reinterpretation.";
RL J. Biol. Chem. 269:22488-22491(1994).

CC -!- FUNCTION: Sodium-dependent glucose transporter. Has a Na+ to 
CC glucose coupling ratio of 1:1. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- TISSUE SPECIFICITY: KIDNEY, INTESTINE, LIVER, SKELETAL MUSCLE, 
CC AND SPLEEN. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).
CC -!- CAUTION: Was originally (Ref.1) thought to be a sodium/neutral
CC amino acid cotransporter (system a neutral amino acid transporter)

CC responsible for the sodium-dependent intake of neutral amino acids 

CC such as alanine, glycine, serine, cysteine, and proline. 

CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC

DR EMBL; L02900; AAC37325.1; -. 
DR PIR; A44432; A44432. 

DR InterPro; IPR001734; Na/solut_symport.
DR Pfam; PF00474; SSF; 1.
DR TIGRFAMs; TIGR00813; sss; 1.
DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.
DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.
DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Sugar transport; Transmembrane; Sodium transport; Symport; 
KW 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
FT 
SQ 
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N-LINKED (GLCNAC. . .) (PO 


SEQUENCE 660 AA; 72745 MW; 38616367F8F18F1A CRC64;



Query Match 10.2%; Score 303.5; DB 1; Length 660; 

Best Local Similarity 23.2%; Pred. No. 5.9e-14; 

Matches 141; Conservative 103; Mismatches 230; Indels 135; Gaps 26;



Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70

| :::::::: I I : I I II I h : I I I : | : : | : : I I : I 

Db 32 I VI Y FVWMAVGLWAML RT - N RGT V GGFFLAGRDVTWWPMGASLFASNIGSGHFVG 86 

Qy 71 TAEAVYVPGYGLA WAQAP I G Y S L S L I LGGL F FAK PMRS KG YVTML D P FQQ I Y- GKRM 126

| | : | I | | : : | | | ' I : I I : : : : I I I : 

Db 87 LAGTGAASGIAIAAFEW NALLLLLVLGWFFVPI YIKAGVMTMPEYLRKRFGGKRL 141 

Qy 127 GGLLFIPAL MGEMFWAAAI FS ALGAT I S VI I DVDMH I S VI I SAL I AT L YT LVG 179 

| | : | : : : I I I | : : : | : : : : : I : I I : I 

Db 142 QIYLSILSLFICVALRISSDIFSGAIF IKLALGLDLYLAIFSLLAITAIYTITG 195

Qy 180 GLYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQK--PWLGTVD 233 

I I I I I I I : I : : I : I : I I I I : : II I • I 

Db 196 G LAS VI YT DT LQT I IML I G S FI LMG FAF VEVGGYESFTEKYMNAIPTIVEGDNLTI 251

Qy 234 SSEVYS-WLDSFLLL— MLGGIPW QAYFQRVLS S S SAT YAQ 271 

I : I : III: I I I I I I I I I : : : 

Db 252 S PKCYTPQGDS FHI FRDAVTGDI PWPGMI FGMTVVAAWYWCTDQVI VQRCLSGKDMSHVK 311 

Qy 272 VL S FLAAFGCLVMAI P AI L I GAI GAS T DWNQT AYGL P D P KT TEE-- ADMILPIVLQ 325

: : | : : : I I : I : I II = : I — 

Db 312 AACIMCGYLKLLPMFLMVMPGMISRILYTEKVACWPSECVKHCGTEVGCSNYAYPLLVM 371 

Qy 326 YLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVWVMRITV 385

M : I : I : : I I I I I I I : : I : : I I : I I : I I : : I : : 

Db 372 E LMP S G L RGLML S VMLAS LMS S LT S I FN S AS T L FTMD L Y - T K I RKQ AS E K E L L I AGRL F I 430 

Qy 386 FVFGASATAMALLTKTVYG LWYLSSDLVYI — VIFPQLLCVLFVKGTNTYGA V 436 

: : I : I : I I : I I : I I I I I : 

Db 431 I LLIVI SIVWVPLVQVAQNGQLFHYI ESI SSYLGPPIAAVFLLAI FCKRVNEQGAFWGLI 490 

Qy 437 AGYVSGL FLRITG GEPYLYLQPLI FYPGYYPDDNGI YNQKF 477 

I : I II I: II I Ml Is 
Db 491 IGFVMGLIRMIAEFVYGTGSCLAASNCPQIICGVHYLYFALILFF 535 



Qy 478 PFKTLAMVTSFLTNICISYLAK YLFE S GT L P P KL D VFDAWARH 521

| | : I I I I : I : s s : I : I I I I 

Db 536 VSILWLAISLLTKPIPDVHLYRLCWALRNSTEERIDL-DAEEKRHEEAHDG 586 



Qy 522 -SEENMDKT 529 

1:1 ss I 

Db 587 VDEDNPEET 595 



RESULT 6 
SL52_RABIT 

ID SL52_RABIT STANDARD; PRT; 672 AA. 

AC P26430; 

DT 01-AUG-1992 (Rel. 23, Created) 

DT 01-AUG-1992 (Rel. 23, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 



DE Sodium/nucleoside cotransporter (Na(+)/nucleoside cotransporter).
GN SLC5A2 OR SNST1.
OS Oryctolagus cuniculus (Rabbit).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Lagomorpha; Leporidae; Oryctolagus. 

OX NCBI_TaxID=9986; 

RN [1] 

RP SEQUENCE FROM N.A.
RC TISSUE=Kidney;
RX MEDLINE=92156077; PubMed=1740408;

RA Pajor A.M., Wright E.M.; 

RT "Cloning and functional expression of a mammalian Na+/nucleoside 

RT cotransporter. A member of the SGLT family."; 

RL J. Biol. Chem. 267:3557-3560(1992). 

CC -!- FUNCTION: Actively transports uridine into cells by Na+

CC co-transport. May play a role in reabsorption of nucleosides from 

CC glomerular filtrate by the proximal tubule in kidney, and in the 

CC regulation of cardiac contractility by adenosine. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- TISSUE SPECIFICITY: More abundant in heart than in kidney, where 
CC it is absent from the outer cortex. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; M84020; AAA31421.1; -. 

DR PIR; A42251; A42251. 

DR InterPro; IPR001734; Na/solut_symport.
DR Pfam; PF00474; SSF; 1.
DR TIGRFAMs; TIGR00813; sss; 1.
DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.
DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.
DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Transmembrane; Sodium transport; Symport; Glycoprotein. 
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FT 
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POTENTIAL. 
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nvmn T\ /"*• T7T T ITT T\ n / T*i/~\m XT' XT m TUT \ 
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546 


POTENTIAL. 


FT DOMAIN 547 650 CYTOPLASMIC (POTENTIAL).


FT 


TRANSMEM 
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671 


POTENTIAL. 


FT CARBOHYD 250 250 N-LINKED (GLCNAC...) (POTENTIAL)
SQ SEQUENCE 672 AA; 73161 MW; E2D987B03B9C57B4 CRC64;



Query Match 10.0%; Score 298; DB 1; Length 672; 

Best Local Similarity 25.0%; Pred. No. 1.5e-13; 

Matches 153; Conservative 89; Mismatches 232; Indels 138; Gaps 25; 

Qy 9 IAII-VFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67 

I I: I ::||:: ||:|: II I |: : II : I :: |: :l I: 

Db 26 IAVIAAYFLLVIGVGLWSMCRT-NRGTV GGYFLAGRSMVWWPVGASLFASNIGSGH 80 



Qy 68 INGTAEAVYVPGYGLAWAQAPIGYSLS LILGGLFFAKPMRSKGYVTMLDPFQQI YG 123 

II III II:: : : I I I I : I : I I I 
Db 81 FVGLA GT GAAN GLAVAG FEWNALFWLL LGW L FAP VYLT AGVI TM PQYLR 130 

Qy 124 KRMGG : LLFI PALMGEMFWAAAI F- - SALGATI SVI I DVDMHI SVI I SA 169 

| | M I : I : : : | : I I I I I : I I I 

Db 131 KRFGGHRIRLYLSVLSLFLYIFTKISVDMFSGAVFIQQALGWNI YASVIALL 182 

Qy 170 L I AT L YT LVG GL Y S VAYT DWQ L FC I FVGLW I S VP FAL S H P AVAD I G FT AVHAK Y 224 

I : I I : I I I :: I I I I I I I I : I : I I : : : I I 

Db 183 GITMVTTVTGGLAALMYTDTVQTFVIIAGAFILTGYAFHEVG GYSGLFDKYMGAMT 238 

Qy 225 QKPWLGTVDSSEVYSWLDSFLLL MLGGIPW QAYF 258 

: I : I : I I I I : I I : I : I I I 

Db 239 SLTVSEDPAVGNISSSCYRPRPDSYHLLRDPVTGDLPWPALLLGLTIVSGWYWCSDQVIV 298 

Qy 259 Q RVL S S S SAT YAQVLS FLAAFGC LVMAI P AI L I GAI GAS T DWNQT AYGL P D P KT TE 314 

| | | : : I : : I : I :: I I :: I I : I I 

Db 299 Q RC LAG RN LT H I KAGC ILCGYLKLTPMF LMVM P GM ISRILYPD E VAC VAP E VC KRVC GTE 358 

Qy 315 E--ADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNA 372

: : : | : : I I : I : I I : I I I I I : I : : I : I I I I I 

Db 359 VGCSNIAYPRLWKLMPNGLRGLMLAVMLAALMSSLASIFNSSSTLFTMDIYTL--RPRA 416 

Qy 373 S DKEI WVMRI TVFVFGASATAMALLT KTVYG LWYLSSDLVYIV — IFPQLLCVLFV 427 

: I : : I I : I I : I : : I I : I I ' ' : Ml 

Db 417 GEGELLLVGRLWWFIVAVSVAWLPWQAAQGGQLFDYIQSVSSYLAPPVSAVFWALFV 476 

Qy 428 KGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMV-- 485

I I I I : I I : : I : I I I : I 

Db 477 PRVNEKGAFWGLI GGLLMGLARLI P EFSFGTGSCVRP 513 

Qy 486 TSFLTNICISYLAKYLFE-SG T L P - P KL DVFDAVVA- RH S EENMDKT I 530

: I I : I I I I I I |||:: I : I I I : I 
Db 514 SACPAFLCRVHYLYFAIVLFFCSGLLIIIVSLCTAPIPRKHLHRLVFSLRHSKE 567 

Qy 531 LVKNENI KLDEL 542 

: I : : III 
Db 568 — EREDLDADEL 577 



RESULT 7 
SL54_HUMAN 

ID SL54_HUMAN STANDARD; PRT; 659 AA. 

AC Q9NY91; 015279; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Low affinity sodium-glucose cotransporter (Sodium/glucose
DE cotransporter 3) (Na(+)/glucose cotransporter 3).
GN SLC5A4 OR SAAT1 OR SGLT2.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Small intestine; 

RA Gorboulev V., Baumgarten K., Veyhl M., Koepsell H.;
RT "The molecular cloning and functional characterization of the human
RT SGLT2 transporter.";
RL Submitted (FEB-2000) to the EMBL/GenBank/DDBJ databases.

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20057165; PubMed=10591208;
RA Dunham I., Hunt A.R., Collins J.E., Bruskiewich R., Beare D.M.,
RA Clamp M., Smink L.J., Ainscough R., Almeida J.P., Babbage A.K.,
RA Bagguley C., Bailey J., Barlow K.F., Bates K.N., Beasley O.P.,
RA Bird C.P., Blakey S.E., Bridgeman A.M., Buck D., Burgess J.,
RA Burrill W.D., Burton J., Carder C., Carter N.P., Chen Y., Clark G.,
RA Clegg S.M., Cobley V.E., Cole C.G., Collier R.E., Connor R.,
RA Conroy D., Corby N.R., Coville G.J., Cox A.V., Davis J., Dawson E.,
RA Dhami P.D., Dockree C., Dodsworth S.J., Durbin R.M., Ellington A.G.,
RA Evans K.L., Fey J.M., Fleming K., French L., Garner A.A.,
RA Gilbert J.G.R., Goward M.E., Grafham D.V., Griffiths M.N.D., Hall C.,
RA Hall R.E., Hall-Tamlyn G., Heathcott R.W., Ho S., Holmes S.,
RA Hunt S.E., Jones M.C., Kershaw J., Kimberley A.M., King A.,
RA Laird G.K., Langford C.F., Leversha M.A., Lloyd C., Lloyd D.M.,
RA Martyn I.D., Mashreghi-Mohammadi M., Matthews L.H., Mccann O.T.,
RA Mcclay J., Mclaren S., McMurray A.A., Milne S.A., Mortimore B.J.,
RA Odell C.N., Pavitt R., Pearce A.V., Pearson D., Phillimore B.J.C.T.,
RA Phillips S.H., Plumb R.W., Ramsay H., Ramsey Y., Rogers L., Ross M.T
RA Scott C.E., Sehra H.K., Skuce C.D., Smalley S., Smith M.L.,
RA Soderlund C., Spragon L., Steward C.A., Sulston J.E., Swann R.M.,
RA Vaudin M., Wall M., Wallis J.M., Whiteley M.N., Willey D.L.,
RA Williams L., Williams S.A., Williamson H., Wilmer T.E., Wilming L.,
RA Wright C.L., Hubbard T., Bentley D.R., Beck S., Rogers J., Shimizu N
RA Minoshima S., Kawasaki K., Sasaki T., Asakawa S., Kudoh J.,
RA Shintani A., Shibuya K., Yoshizaki Y., Aoki N., Mitsuyama S.,
RA Roe B.A., Chen F., Chu L., Crabtree J., Deschamps S., Do A., Do T.,
RA Dorman A., Fang F., Fu Y., Hu P., Hua A., Kenton S., Lai H., Lao H.I
RA Lewis J., Lewis S., Lin S.-P., Loh P., Malaj E., Nguyen T., Pan H.,
RA Phan S., Qi S., Qian Y., Ray L., Ren Q., Shaull S., Sloan D., Song L
RA Wang Q., Wang Y., Wang Z., White J., Willingham D., Wu H., Yao Z.,
RA Zhan M., Zhang G., Chissoe S., Murray J., Miller N., Minx P.,
RA Fulton R., Johnson D., Bemis G., Bentley D., Bradshaw H., Bourne S.,
RA Cordes M., Du Z., Fulton L., Goela D., Graves T., Hawkins J.,
RA Hinds K., Kemp K., Latreille P., Layman D., Ozersky P., Rohlfing T.,
RA Scheet P., Walker C., Wamsley A., Wohldmann P., Pepin K., Nelson J.,
RA Korf I., Bedell J.A., Hillier L.W., Mardis E., Waterston R.,
RA Wilson R., Emanuel B.S., Shaikh T., Kurahashi H., Saitta S.,
RA Budarf M.L., McDermid H.E., Johnson A., Wong A.C.C., Morrow B.E.,
RA Edelmann L., Kim U.J., Shizuya H., Simon M.I., Dumanski J.P.,
RA Peyrard M., Kedra D., Seroussi E., Fransson I., Tapia I., Bruder C.E.,
RA O'Brien K.P., Wilkinson P., Bodenteich A., Hartman K., Hu X.,
RA Khan A.S., Lane L., Tilahun Y., Wright H.;
RT "The DNA sequence of human chromosome 22.";
RL Nature 402:489-495(1999).

RN [3] 

RP SEQUENCE OF 73-247 FROM N.A.
RC TISSUE=Brain;
RA Poppe R., Koepsell H.;
RL Submitted (DEC-1995) to the EMBL/GenBank/DDBJ databases.
CC -!- FUNCTION: Sodium-dependent glucose transporter (By similarity).
CC -!- SUBCELLULAR LOCATION: Integral membrane protein.
CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AJ133127; CAB81772.1; -. 



DR EMBL; AL008723; CAB51758.1;
DR EMBL; U41897; AAB61732.1; -
DR Genew; HGNC:11039; SLC5A4.
DR InterPro; IPR001734; Na/solut_symport.
DR Pfam; PF00474; SSF; 1.
DR TIGRFAMs; TIGR00813; sss; 1
DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.
DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.
DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.
KW Transport; Sugar transport; Transmembrane; Sodium trans]
KW Glycoprotein.
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FT 


TRANSMEM 527 547 POTENTIAL.
FT DOMAIN 548 637 CYTOPLASMIC (POTENTIAL).
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FT 


CARBOHYD 248 248 N-LINKED (GLCNAC...) (POTENTIAL)
FT CONFLICT 76 76 A -> V (IN REF. 3).
FT CONFLICT 106 106 S -> P (IN REF. 3).
FT CONFLICT 243 243 V -> I (IN REF. 3).
SQ SEQUENCE 659 AA; 72455 MW; F8A34AED648B523A CRC64;



Query Match 9.9%; Score 294; DB 1; Length 659; 

Best Local Similarity 22.5%; Pred. No. 2.7e-13; 

Matches 135; Conservative 97; Mismatches 225; Indels 144; Gaps 22; 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70

I : : : : I : : : I I : I I : I I I : : I I I : | : : | : : | : | 

Db 32 I VI YFLWMAVGLWAMLKT-NRGTI G G FFLAGRDMAWWPMGAS L FAS NIG S NH YVG 86 

Qy 71 T AEAVYVP G Y G LAWAQAP I G Y S LSLILGGLFFAKPMRSKGYVTMLDPFQQIYGKR 125 

I III I : : 1111 : 1 : : I I : I I : II 

Db 87 LA GT GAAS GVAT VT FEWT S S VML L I L GW I FVP I Y I K S - GVMTM PEYLKKR 135 

Qy 126 MGG LLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SAL I AT 173 

II : : I : : I II I I : : : I : : : : : I : 

Db 136 FGGERLQVYLS I LSLFI CWLLI SADI FAGAI F 1 KLALGLDLYLAI FI LLAMTA 189 

Qy 174 L YT LVGGL Y S VAYT D WQL FC I FVGLW I S VP FAL S H P AVAD I G FTAVHAKYQ K PWLGT VD 233 

: I I I I I I I I I I : I : : I : I : I I : I : : II I : 

Db 190 VYTTTGGLASVIYTDTLQTIIMLIGSFILMGFAFNEVG GYESFTEKYVNATPSWE 245 

Qy 234 SSEVYS-WLDSFLLL MLGGIPW QAYFQRVLSSS 265 

I : I : III: : I I I I I III 

Db 246 GDNLTISASCYTPRADSFHIFRDAVTGDIPWPGIIFGMPITALWYWCTNQVIVQRCLCGK 305 

Qy 266 SATYAQVLS FLAAFGCLVMAI PAILI GAI GASTDWNQTAYGLP DPKTTEEA 316 

s : : : I : I : : : I I : I : I I II 

Db 306 DMSHVKAACIMCAYLKLLPMFLMVMPGMISRILYTDMVACWPSECVKHCGVDVGCTNYA 365 

Qy 317 DMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKE 376

I : : II : I : I :: I I I I I I I : : I : : I I : I I : I I 
Db 366 YPTMVLELMPQGLRGLMLSVMLASLMSSLTSIFNSASTLFTIDLY-TKMRKQASEKE 421 

Qy 377 I VWVMRI T VFVFGAS ATAMALLT KT VYG LWYLSSDLVYI - -VI FPQLLCVLFVKGTN 431 

: : III: : I : : I I I : I : : I I I 

Db 422 LLI AGRI FVLLLTWS I VWVPLVQVSQNGQLIHYTES I S S YLGPPIAAVFVLAI FCKRVN 481 

Qy 432 TYGAVAGYVSGLFL RITGGEPYLYLQPLI FYPGYYPD 468 

I I I : I I : : I I I I I : : I : 
Db 482 EQGAFWGLMVGLAMGLIRMITEFAYGTGSCLAPSNCPKIICGVHYLYFSIVLFF 535 

Qy 469 DNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDA--VVARHSEENM 526

I I : I I I I : I : : : I : : I I : 

Db 536 GSMLVTLGI SLLTKPI PDVHLYRLCWVLRNSTEERI 571 



Qy 527 D 527
|
Db 572 D 572



RESULT 8 
SL51_SHEEP

ID SL51_SHEEP STANDARD; PRT; 664 AA.

AC P53791; 

DT 01-OCT-1996 (Rel. 34, Created) 

DT 01-OCT-1996 (Rel. 34, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Sodium/glucose cotransporter 1 (Na(+)/glucose cotransporter 1)
DE (High affinity sodium-glucose cotransporter).
GN SLC5A1 OR SGLT1.
OS Ovis aries (Sheep).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea;
OC Bovidae; Caprinae; Ovis.
OX NCBI_TaxID=9940;

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Jejunal mucosa; 

RX MEDLINE=96077158; PubMed=7492327;
RA Tarpey P., Wood I.S., Shirazi-Beechey S.P., Beechey R.B.;
RT "Amino acid sequence and the cellular location of the Na(+)-dependent
RT D-glucose symporters (SGLT1) in the ovine enterocyte and the parotid
RT acinar cell.";

RL Biochem. J. 312:293-300(1995). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Mammary gland;
RX MEDLINE=98050042; PubMed=9388688;
RA Shillingford J.M., Wood I.S., Shennan D.B., Shirazi-Beechey S.P.,
RA Beechey R.B.;
RT "Determination of the sequence of a mRNA from lactating sheep mammary
RT gland that encodes a protein identical to the Na(+)-dependent glucose
RT transporter (SGLT1).";
RL Biochem. Soc. Trans. 25:467-467(1997).

CC -!- FUNCTION: Actively transports glucose into cells by Na(+) co-
CC transport with a Na(+) to glucose coupling ratio of 2:1.
CC -!- FUNCTION: Efficient substrate transport in mammalian kidney is
CC provided by the concerted action of a low affinity high capacity
CC and a high affinity low capacity Na(+)/glucose cotransporter
CC arranged in series along kidney proximal tubules.
CC -!- SUBCELLULAR LOCATION: Integral membrane protein.
CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 

DR EMBL; X82411; CAA57809.1; -. 

DR EMBL; X82410; CAA57808.1; -. 

DR EMBL; AJ001026; CAA04483.1; -. 

DR PIR; S59637; S59637. 



DR InterPro; IPR001734; Na/solut_symport.
DR Pfam; PF00474; SSF; 1.
DR TIGRFAMs; TIGR00813; sss; 1.
DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.
DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.
DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.
KW Transport; Sugar transport; Transmembrane; Sodium transport; Symp<
KW Glycoprotein.
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292 


314 


CYTOPLASMIC (POTENTIAL) . 


FT TRANSMEM 315 334 POTENTIAL.
FT DOMAIN 335 423 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 424 443 POTENTIAL.
FT DOMAIN 444 455 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 456 476 POTENTIAL.
FT DOMAIN 477 526 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 527 547 POTENTIAL.
FT DOMAIN 548 642 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 643 663 POTENTIAL.
FT CARBOHYD 248 248 N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 664 AA; 73178 MW; 820AC019B5C93987 CRC64;



Query Match 9.9%; Score 294; DB 1; Length 664; 

Best Local Similarity 23.9%; Pred. No. 2.8e-13; 

Matches 127; Conservative 93; Mismatches 202; Indels 110; Gaps 23; 

Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67

| :::::::: I I : I I : I I I : : I I : I : : I : : I I :

Db 32 IVIYFVWMAVGLWAMFST-NRGTV GGFFLAGRSMVWWPIGASLFASNIGSGHFVG 86

Qy 68 INGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSK-GYVTMLDPFQQIYGKRM 126

: | I I : || I ::| I : I I : I I I I I • I I

Db 87 LAGTGAAAGIATGGFEWN AL I LWLLGWVFV- - P I YI KAGWTM PEYLRKRF 136

Qy 127 GG LLFI PALMGEMFWAAAI FSALGATI SVI I DVDMHI SVI I SALIATL 174

|| :|:| : :: Ml |:: : :|::::: I I I

Db 137 GGQRIQVYLSVLSLVLYI FTKI SADI FSGAI F INLALGLDLYLAI FI LLAI TAL 190

Qy 175 YT LVGGL YS VAYT D WQL FC I FVGLW I S VP FAL S H P AVAD I G FT AVHAK YQ K PWLGT VD S 234

I I : I I I : I I I I : I : :| :| II l"l M ' \\ \

Db 191 YT ITGGLAAVI YTDTLQTVIMLLGS FI LTGFAFHEVG GYSAFVTKYMNA-IPTVTS 245

Qy 235 SEVYS-WLDSFLLL MLGGIPW QAYFQRVLSSS 265

II: III: : I : I I I M I I :

Db 246 YGNTTVKKECYTPRADS FHI FRDPLKGDLPWPGLI FGLT 1 1 SLWYWCTDQVI VQRCLSAK 305


Qy 266 SAT YAQVL S FLAAFGC LVMAI P AI L I GAI GAS T DWNQT AYG L P D P KT T EE AD 317

: :: : : : |: : : I I : I : I h = 

Db 306 NMSHVKAGCIMCGYMKIjLPMFLMVMPGMI S RI LFTEKVACTV — PSECEKYCGTKVGCTN 363 

Qy 318 MILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEI 377 

: I : : II : I : I : : ' * I : I I I : I I : I I : 

Db 364 IAYPTLWELMPNGLRGLMLSVMLASLMSSLTSIFNSASTLFTMDIY-TKIRKKASEKEL 422 

Qy 378 VWVMRITVFV-FGASATAMALLTKTVYG--LWYLSSDLVYI--VIFPQLLCVLFVKGTNT 432

: I : : I I I : : : I I : I I : I I : I I I 

Db 423 MIAGRLFMLVLIGVSIAWVPIVQSAQSGQLFDYIQSITSYLGPPIAAVFLLAI FCKRVNE 482 

Qy 433 YGAVAGYVSGLFLRITG GEPYLYLQPLIF 461

I I I : I : : II I I I I : : I 

Db 483 PGAFWGLIIGFLIGVSRMITEFAYGTGSCMEPSNCPTIICGVHYLYFAIILF 534 



RESULT 9 
SGLT_VIBPA 

ID SGLT_VIBPA STANDARD; PRT; 543 AA. 

AC P96169; 

DT 30-MAY-2000 (Rel. 39, Created) 

DT 30-MAY-2000 (Rel. 39, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Sodium/glucose cotransporter (Na(+)/glucose symporter).
GN SGLT.
OS Vibrio parahaemolyticus.
OC Bacteria; Proteobacteria; Gammaproteobacteria; Vibrionales;

OC Vibrionaceae; Vibrio. 

OX NCBI_TaxID=670; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=AQ3334; 

RX MEDLINE=96248401; PubMed=8652595;
RA Sarker R.I., Okabe Y., Tsuda M., Tsuchiya T.;
RT "Sequence of a Na+/glucose symporter gene and its flanking regions of
RT Vibrio parahaemolyticus.";
RL Biochim. Biophys. Acta 1281:1-4(1996).

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20400508; PubMed=10835424;
RA Turk E., Kim O., Le Coutre J., Whitelegge J.P., Eskandari S.,
RA Lam J.T., Kreman M., Zampighi G., Faull K.F., Wright E.M.;
RT "Molecular characterization of Vibrio parahaemolyticus vSGLT: a model
RT for sodium-coupled sugar cotransporters.";

RL J. Biol. Chem. 275:25711-25716(2000). 

RN [3] 

RP MASS SPECTROMETRY OF FORMYLATED FORM, AND REVISIONS TO N-TERMINUS.
RX MEDLINE=20222957; PubMed=10757971;
RA le Coutre J., Whitelegge J.P., Gross A., Turk E., Wright E.M.,

RA Kaback H.R., Faull K.F.; 

RT "Proteomics on full-length membrane proteins using mass 

RT spectrometry."; 

RL Biochemistry 39:4237-4242(2000). 

CC -!- FUNCTION: ACTIVELY TRANSPORTS GLUCOSE INTO CELLS BY NA(+) CO-
CC TRANSPORT (BY SIMILARITY).
CC -!- SUBCELLULAR LOCATION: Integral membrane protein.
CC -!- MASS SPECTROMETRY: MW=60680; METHOD=Electrospray.
CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC

DR EMBL; D78137; BAA11215.1; ALT_FRAME. 

DR EMBL; AF255301; AAF80602.1; -. 

DR InterPro; IPR001734; Na/solut_symport.
DR Pfam; PF00474; SSF; 1.
DR TIGRFAMs; TIGR00813; sss; 1.
DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.
DR PROSITE; PS00457; NA_SOLUT_SYMP_2; FALSE_NEG.
DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Sugar transport; Transmembrane; Sodium transport; Symport. 

FT TRANSMEM 10 30 POTENTIAL. 

FT TRANSMEM 45 65 POTENTIAL. 

FT TRANSMEM 79 99 POTENTIAL. 

FT TRANSMEM 129 149 POTENTIAL.

FT TRANSMEM 156 176 POTENTIAL. 

FT TRANSMEM 193 213 POTENTIAL. 

FT TRANSMEM 246 266 POTENTIAL. 

FT TRANSMEM 287 307 POTENTIAL. 

FT TRANSMEM 345 365 POTENTIAL. 

FT TRANSMEM 401 421 POTENTIAL.
FT TRANSMEM 427 447 POTENTIAL.

FT TRANSMEM 455 475 POTENTIAL. 

FT TRANSMEM 483 503 POTENTIAL. 

FT TRANSMEM 523 543 POTENTIAL. 

SQ SEQUENCE 543 AA; 58874 MW; 61BE3F7E380BC32C CRC64; 

Query Match 9.9%; Score 293.5; DB 1; Length 543; 

Best Local Similarity 25.1%; Pred. No. 2.4e-13; 

Matches 139; Conservative 96; Mismatches 198; Indels 121; Gaps 27; 

Qy 4 HVEGLIAIIVF--YLLILL-VGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTA 60

| ||:|| | : | : : I I : I : : : : : : I : I ' : I I 

Db 6 HGLS FI DIMVFAI YVAI 1 1 GVGLWV S RDKKGTQKSTEDYFLAGKS L PWWAVGAS LI A 62 

Qy 61 TWV GGGYINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGY 111 

I II I I I I II s:M: I : I II 

Db 63 ANISAEQFIGMSGSGYSIGLAIASY EWMSA ITLIIVGKYFLPIFIEKGI 111 

Qy 112 VTMLDPFQQIYGKRMGGLLFIPALMGEMFWAAA-IFSAL GATISVIIDVDMHI 163 

| : : : : : | : : : I : II : II I | : | : : : 

Db 112 YTIPEFVEKRFNKKLKTILAV FWISLYIFVNLTSVLYLGGLALETILGIPLMY 164 

Qy 164 SVIISALIATLYTLVGGLYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAK 223

| : : | | I : I : : I I I : I : I I i : I : I : : I I : : I I : I I I

Db 165 SILGLALFALVYSIYGGLSAWWTDVIQVFFLVLG GFMTTYMAVSFIGGT 214

Qy 224 YQKPWLGTV- DSSEVYSWLDSFLLLMLGGIPW- QAY 257

II ::::::

Db 215 --DGWFAGVSKMVDAAPGHFEMILDQSNPQYMNLPG-IAVLIGGL-WVANLYYWGFNQYI 270

Qy 258 FQRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIG-AIGASTDWNQTAYGLPDPKTTE 314

|| | . . | . || ill | : : : I I 1 1 1 II 1

Db 271 IQRTI^^SVSEAQKGIVF7^FLKLIVPFLWLPGIAAYVITSDPQLMASLGDIAATNLP 330

Qy 315 EADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQN 371

ll 1 • 1 • 1 i 1 * 1 : : II : : 1 1 1 : I : : : I : I 1 : :

Db 331 SAANADKAYPWLTQFL-PVGVKGWFT^AIJ^IVSSIJVSMLNSTATIFTMDIYKEYISPD 389

Qy 372 ASDKEIVWVMRITVFVFGASATAMALLTKTVYG--LWYLSSDLVYIVIFPQLLCV 424

: | : : I I | : | : | | : : | : II : : | : | I

Db 390 SGDHKLVNV GRTAAWAL 1 1 ACLI APMLGGI GQAFQ YI QEYTGLVS PGI IiAV 441

Qy 425 LFVKGTNTYGAVAGYVSGLFLRITGGEPY-LYLQPLIFYPGYYP-DDNGIYNQKFP 478

II 1 1 : II: 1 1: : 1 : 1 : 1 : 1 1 1 1 : 1 1

Db 442 FLLGLFWKKTTS KGAI I GWAS I P FAL FLK FMP L SMP FMDQML YTLL FT 490

Qy 479 FKTLAMVTSFLTNI 492

: 1 II 1 : 1

Db 491 MWIAF-TSLSTSI 503





RESULT 10 
SL52_HUMAN 

ID SL52_HUMAN STANDARD; PRT; 672 AA. 

AC P31639; 

DT 01-JUL-1993 (Rel. 26, Created) 

DT 01-JUL-1993 (Rel. 26, Last sequence update) 

DT 16-OCT-2001 (Rel. 40, Last annotation update) 

DE Sodium/glucose cotransporter 2 (Na(+)/glucose cotransporter 2)

DE (Low affinity sodium-glucose cotransporter).

GN SLC5A2 OR SGLT2.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Kidney; 

RX MEDLINE=93035768; PubMed=1415574;

RA Wells R.G., Pajor A.M., Kanai Y., Turk E., Wright E.M., Hediger M.A.;

RT "Cloning of a human kidney cDNA with similarity to the sodium-glucose

RT cotransporter.";

RL Am. J. Physiol. 263:F459-F465(1992).

CC -!- FUNCTION: Sodium-dependent glucose transporter. Has a Na+ to
CC glucose coupling ratio of 1:1.

CC -!- FUNCTION: Efficient substrate transport in mammalian kidney is
CC provided by the concerted action of a low affinity high capacity

CC and a high affinity low capacity Na(+)/glucose cotransporter

CC arranged in series along kidney proximal tubules.

CC -!- SUBCELLULAR LOCATION: Integral membrane protein.

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC --------------------------------------------------------------------------

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; M95549; AAA36608.1; -.
DR PIR; A56765; A56765.
DR Genew; HGNC:11037; SLC5A2.
DR MIM; 182381;
DR GO; GO:0016021; C:integral to membrane; TAS.
DR GO; GO:0005362; F:low-affinity glucose:sodium symporter activity; TAS.
DR GO; GO:0005975; P:carbohydrate metabolism; TAS.
DR GO; GO:0006810; P:transport; TAS.
DR InterPro; IPR001734; Na/solut_symport.
DR Pfam; PF00474; SSF; 1.
DR TIGRFAMs; TIGR00813; sss; 1.
DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.
DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.
DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.
KW Transport; Sugar transport; Transmembrane; Sodium transport; Symport;
KW Glycoprotein.
FT DOMAIN 1 25 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 26 44 POTENTIAL.
FT DOMAIN 45 61 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 62 82 POTENTIAL.
FT DOMAIN 83 102 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 103 123 POTENTIAL.
FT DOMAIN 124 168 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 169 188 POTENTIAL.
FT DOMAIN 189 205 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 206 226 POTENTIAL.
FT DOMAIN 227 270 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 271 291 POTENTIAL.
FT DOMAIN 292 314 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 315 334 POTENTIAL.
FT DOMAIN 335 423 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 424 443 POTENTIAL.
FT DOMAIN 444 455 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 456 476 POTENTIAL.
FT DOMAIN 477 526 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 527 547 POTENTIAL.
FT DOMAIN 548 650 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 651 671 POTENTIAL.
FT CARBOHYD 250 250 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 399 399 N-LINKED (GLCNAC...) (POTENTIAL).
FT SITE 40 40 IMPLICATED IN SODIUM COUPLING
FT (BY SIMILARITY).
FT SITE 300 300 IMPLICATED IN SODIUM COUPLING
FT (BY SIMILARITY).
SQ SEQUENCE 672 AA; 72896 MW; 233C65F1601B0337 CRC64;



Query Match 9.8%; Score 292; DB 1; Length 672; 

Best Local Similarity 24.1%; Pred. No. 3.9e-13; 

Matches 147; Conservative 91; Mismatches 237; Indels 136; Gaps 22; 



Qy 8 LIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGY 67



I . . i i . . I 1 * I • Mil- I I I • • I • • I I • 

Db 26 ILVIAAYFLLVT GVGLWSMCRT-NRGTV GGYFLAGRSMVWWPVGASLFASNIGSGH 80 

Qy 68 INGTAEAVYVPG YGLAWAQAP I GYS LS LILGGLFFAKPMRSKGYVTMLDPFQQIYG 123 

II I I I II:: : : I I I I : I : I I I 

Db 81 FVGLA GT GAAS G LAVAG F EWN AL FWLLL GW L FAP VYLT AGVI TM PQYLR 130 

Qy 124 KRMGG LLFIPALMGEMFWAAAIF--SALGATISVIIDVDMHISVIISA 169

I I I I I : I : : : I : I II I I : I I I 

Db 131 KRFGGRRIRLYLSVLSLFLYI FTKISVDMFSGAVFIQQALGWNI YASVIALL 182 

Qy 170 LIATLYTLVGGLYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKY 224

I : I I : I I I :: I I I I I I I I I : : I I : : : II 

Db 183 G I TMI YT VT GGLAALMYT DT VQT FVI LGGAC I LMG YAFHEVG GYSGLFDKYLGAAT 238 

Qy 225 QKPWLGTVDSSEVYSWLDSFLLL MLGGIPW QAYF 258 

: I : I : I I I : I I : I : I I I 

Db 239 SLTVSEDPAVGNISSFCYRPRPDSYHLLRHPVTGDLPWPALLLGLTIVSGWYWCSDQVIV 298 

Qy 259 QRVLSSSSATYAQVLSFLAAFGCLVMAIPAILIGAIGASTDWNQTAYGLPDPKTTE 314

I I I : I I : : I : I :: I I : : | : | : | | 

Db 299 QRCLAGKSLTHIKAGCILCGYLKLTPMFLMVMPGMISRILYPDEVACWPEVCRRVCGTE 358 

Qy 315 E--ADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNA 372

: : : I : : I I : I : I I : I I I I I : I : : I : I I I I 

Db 359 VGCSNIAYPRLWKLMPNGLRGLMIAVMLAALMSSLASIFNSSSTLFTMDIY-TRLRPRA 417 

Qy 373 SDKEIVWVMRI-TVFVFGASATAMALLTKTVYGLWYLSSDLVYIVIFPQLLCV LFV 427

I : I : : I I : I I : I : : : I : I : I : I III 

Db 418 GDRELLLVGRLWWFIVWSVAWLPWQ7^QGGQLFDYIQAVSSYLAPPVSAVFVLALFV 477 

Qy 428 KGTNTYGAVAGYVSGLFLRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMV-- 485

I I I I : I I : : I : I I : : I

Db 478 P RVN EQGAFWGL I GGLLMGLARL I P EFSFGSGSCVQP 514

Qy 486 TSFLTNICISYLAKYLFE-SGTLPPKLDVFDAW ARHSEENMDKTI 530 

: I I : I I I I I I I : : I : M I : I 
Db 515 SAC PAFLCGVH YL YFAI VLFFC S GLLTLTVS LCTAPI PRKHLHRLVFS LRHS KE 568 

Qy 531 LVKNENIKLDE 541 

: I : : I I 
Db 569 --EREDLDADE 577



RESULT 11 
SL54_MOUSE

ID SL54_MOUSE STANDARD; PRT; 656 AA.

AC Q9ET37; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Low affinity sodium-glucose cotransporter (Sodium/glucose 

DE cotransporter 3) (Na(+)/glucose cotransporter 3).

GN SLC5A4A.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.



OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=129/Sv;
RX MEDLINE=20499361; PubMed=11042146;
RA Pletcher M.T., Roe B.A., Chen F., Do T., Do A., Malaj E., Reeves R.H.;
RT "Chromosome evolution: the junction of mammalian chromosomes in the
RT formation of mouse chromosome 10.";
RL Genome Res. 10:1463-1467(2000).
CC -!- FUNCTION: Sodium-dependent glucose transporter (By similarity).
CC -!- SUBCELLULAR LOCATION: Integral membrane protein.
CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; AF251267; AAG01741.1;
DR MGD; MGI:1927848; Slc5a4a.
DR InterPro; IPR001734; Na/solut_symport.
DR Pfam; PF00474; SSF; 1.
DR TIGRFAMs; TIGR00813; sss; 1.
DR PROSITE; PS00456; NA_SOLUT_SYMP_1; FALSE_NEG.
DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.
DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.
KW Transport; Sugar transport; Transmembrane; Sodium transport; Symport;
KW Glycoprotein.
FT DOMAIN 1 28 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 29 47 POTENTIAL.
FT DOMAIN 48 64 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 65 85 POTENTIAL.
FT DOMAIN 86 105 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 106 126 POTENTIAL.
FT DOMAIN 127 171 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 172 191 POTENTIAL.
FT DOMAIN 192 208 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 209 229 POTENTIAL.
FT DOMAIN 230 270 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 271 291 POTENTIAL.
FT DOMAIN 292 314 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 315 334 POTENTIAL.
FT DOMAIN 335 423 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 424 443 POTENTIAL.
FT DOMAIN 444 455 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 456 476 POTENTIAL.
FT DOMAIN 477 526 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 527 547 POTENTIAL.
FT DOMAIN 548 634 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 635 655 POTENTIAL.
FT CARBOHYD 248 248 N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 656 AA; 71837 MW; A6668E815204D39B CRC64;

Query Match 9.8%; Score 290; DB 1; Length 656;



Best Local Similarity 22.3%; Pred. No. 5.2e-13; 

Matches 147; Conservative 102; Mismatches 252; Indels 158; Gaps 25; 



Qy 11 IIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYING 70 

| :::::::: I I : I I : I I : I I : | : : I : : I I : I 

Db 32 IVIYFWVMAVGVWAMLKTNRSTVG GFFLAGRSMTWWPMGASLFASNIGSGHFVG 86 

Qy 71 TAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSK-GYVTMLDPFQQIYGKRMGG- 128

| I : I : : | | : | | : | I : I I : I I : I I I I 

Db 87 LAGT GAAS G I AVT ~ AFE S H S FAL L LVL GW I FV- - P I Y I KAGVMTM PEYLKKRFGGK 139 



Qy 129 LLFIPALM GEMFWAAAIFSALGATISVIIDVDMHISVIISALIATLYT 176

Ml":: : : | : | | I | :::::::::: I I : : I 

Db 140 RLQIYLSILFLFICVTLTISADIF-SGAIF 1 KLALGLNLYLAI LI LLAITAI FT 192 



Qy 177 LVGGLYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVD 233 

: I I I I I I I I I I : I I : I : I I I I : : I : I : 

Db 193 I T GGLAS VI YT DTVQAVIMLVGS FI LMVFAF VE VGG Y E S FT E K FMNAI P S WE GDN 24 8 

Qy 234 SSEVYS-WLDSFLLL MLGGIPW QAYFQRVLSSSSAT 268 

: I I : I I I : : I I I I | | | | : : 

Db 249 LTINSRCYTPQPDSFHIFRDPVTGDIPWPGTAFGMPITALWYWCINQVIVQRCLCGKNLS 308 

Qy 269 YAQVLS FLAAFGCLVMAI PAI LI GAIGAST DWNQTAYGLP DPKTTEEADMI 319

: : I : I : : : I I : I : I III 
Db 309 HVKAACILCGYLKLLPLFFMVMPGMISRILYTDMVACWPSECVKHCGVDVGCTNYA 365 

Qy 320 LPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVW 379

|::: I I : I : |::||| I I II:: I ::| |: ||::|:: 
Db 366 YPMLVLKLMPPGLRGLMLSVMLASLMSSLTSVFNSASTLFTIDLY-TKIRKKASERELLI 424 

Qy 380 VMRITVFVFGASATAMALLTKTVYG LWYLSSDLVYI--VIFPQLLCVLFVKGTNTYG 434

| : I I : : : : I : I : I : I I : I I I I 

Db 425 AGRLFVSVLIVTSILWVPIVEVSQGGQLVHYTEAISSYLGPPIAAVFLVAVFCKR7\NEQG 484 

Qy 435 AVAGYVSGLFL RITGGEPYLYLQPLI FYPGYYPDDNG 471 

I I : I I : : : : I : 

Db 485 AFWGLMVGLVMGLIRMIAEFSYGTGSCLAPSSCPKIICGVHYLYFAIILFF 535 

Qy 472 IYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDV FDAWARHSEENMD 527 

I : : I I I I III : I I : I 

Db 536 VCILVILGVSYLTK PIPDVHLHRLCWALRNSKEERID 572 

Qy 528 KTILVKNENIKLDELALVKPR QSMTLSSTFTNKEAFLDVDSSP 570

III : : I III I I : I I I I 

Db 573 LDAEDKEENGADDRTEEDQTEKPRGCLKKTCDLFCGLQRAEFKLTKVEEEALTDTTEKP 631 



RESULT 12 
SL53_MOUSE 

ID SL53_MOUSE STANDARD; PRT; 718 AA. 

AC Q9JKZ2; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 16-OCT-2001 (Rel. 40, Last annotation update) 

DE Sodium/myo-inositol cotransporter (Na(+)/myo-inositol cotransporter).
GN SLC5A3.



OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20237552; PubMed=10773690;

RA McVeigh K.E., Mallee J.J., Lucente A., Barnoski B.L., Wu S.,

RA Berry G.T.;

RT "Murine chromosome 16 telomeric region, homologous with human

RT chromosome 21q22, contains the osmoregulatory Na(+)/myo-inositol

RT cotransporter (SLC5A3) gene."; 

RL Cytogenet. Cell Genet. 88:153-158(2000). 

CC -!- FUNCTION: Prevents intracellular accumulation of high 

CC concentrations of myo-inositol (an osmolyte) that result in 

CC impairment of cellular function. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC --------------------------------------------------------------------------

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AF220915; AAF43668.1; -. 
DR MGD; MGI:1858226; Slc5a3.
DR InterPro; IPR001734; Na/solut_symport.
DR Pfam; PF00474; SSF; 1.
DR TIGRFAMs; TIGR00813; sss; 1.
DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.
DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.
DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.
KW Transport; Transmembrane; Sodium transport; Symport; Gl
FT DOMAIN 1 9 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 10 29 POTENTIAL.
FT DOMAIN 30 38 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 39 57 POTENTIAL.
FT DOMAIN 58 86 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 87 110 POTENTIAL.
FT DOMAIN 111 123 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 124 144 POTENTIAL.
FT DOMAIN 145 157 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 158 183 POTENTIAL.
FT DOMAIN 184 186 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 187 205 POTENTIAL.
FT DOMAIN 206 303 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 304 324 POTENTIAL.
FT DOMAIN 325 353 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 354 376 POTENTIAL.
FT DOMAIN 377 406 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 407 430 POTENTIAL.
FT DOMAIN 431 443 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 444 462 POTENTIAL.
FT DOMAIN 463 510 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 511 532 POTENTIAL.
FT DOMAIN 533 695 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 696 716 POTENTIAL.
FT CARBOHYD 32 32
FT SITE 24 24 IMPLICATED IN SODIUM COUPLING
FT (BY SIMILARITY).
FT SITE 285 285 IMPLICATED IN SODIUM COUPLING
FT (BY SIMILARITY).
SQ SEQUENCE 718 AA; 79554 MW; D035CFBECDDA803B CRC64;



Query Match 9.7%; Score 289; DB 1; Length 718; 

Best Local Similarity 21.7%; Pred. No. 6.8e-13; 

Matches 150; Conservative 113; Mismatches 209; Indels 218; Gaps 32; 

Qy 9 IAII-VFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGG- 66 

||:: :| :| I:: I : I : I :l III I 

Db 10 IAWALYFILVMCIGFFAMWKSNRSTVS GYFLAGRSM -- TWVAIGA 53

Qy 67 --YINGTAEAVYVPGYGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTM 114

::: :: III : I |:: I =11 * I : I I I II

Db 54 SLFVSNIGSEHFI GLAGSGAASGFAVGAWEFNALLLLQLLGWVFIPIYIRS-GVYTM 109

Qy 115 LDPFQQIYGKRMGG-- --LLFIPALMGEMFWAAAIFSALGATISVIIDVDMH 162

: | | I I I I : I : : : I : I- I : : : :

Db 110 PEYLSKRFGGHRIQVYFAALSLLLYIFTKLSVDLYSGALF IQESLGWNLY 159

Qy 163 ISVIISALIATLYTLVGGLYSVAYTDWQLFCIFVG LWISV PFAL 207

: I I I : : I I : I I I : I I I I : I : : I : I I ; : I 

Db 160 VSVI LLI GMTALLTVTGGLVAVI YTDTLQALLMI I GALTLMVT SMVKI GGFEEVKRRYML 219 



Qy 208 SHPAVADIGFTAVHAKYQK PWLGTV--- DSSEVYSWLDS 243

: I I I I III Ills : I : I 

Db 220 ASPDVASILLKYNLSNTNACMVIiPKANALKMLRDPTDEDVPWPGFILGQTPASVWYWCAD 279 

Qy 244 FLLLMLGGI PWQAYFQRVLSSSSATYAQ VLSFLAAFGCLVMAI PAIL 290 

I | | | | : : : : | : : I I : : : | : : 

Db 280 QVIVQRVLAAKNIAHAKGSTLMAGFLKLLPMFI I WPGMI SRIVFADEI 328 

Qy 291 IGAIGASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSA 342 

: | : : M I :: I I I : : : I 

Db 329 ACINPEHCMQVCGSRAGCSNIAY PRLVMTLVPVGLRGLMMAVMIA 373 

Qy 343 AVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVWVMRITV-FVFGASATAMALLTKT 401

| : | I || | | I : : I : : I : I I : : I I : I : : I I I I I : I : : : : 
Db 374 ALMSDLDSIFNSASTIFTLDVYKL-IRKSASSRELMIVGRIFVAFMWISIAWVPIIVEM 432

Qy 402 VYGLWYLS S DLVYI VI FPQL LCVLFVKGTNT YGAVAGYVSG LFLRITGG 450 

IN I : I : I : I I I I I : I I : 

Db 433 QGGQMYLYIQEVADYLTPPVAALFLLAI FWKRCNEQGAFYGGMAGFVLGAVRLILAFTYR 492 

Qy 451 E P YLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICI 494

| | : | : : | : : | : | : :

Db 493 APECDQPDNRPGFIKDIHYMYVATALFW ITGLIT-VIV 529

Qy 495 SYLAKYLFESGTLPPKLDVFDAWARHSEENMDKTILV KNENI KLDELALVK- P 547 

|| III I I : II:: I : I I : I : : : s 

Db 530 SLL TPPPTKDQI RTTTFWSKKTLVTKESCSQKDEPYKMQEKSILQCS 576 



Qy 548 RQSMTLSSTFTNKEAFLDVDSSPEGSGTED 577 

I : I I I : : : . I : I II 
Db 577 ENSEVISHTIPNGKS EDSIKGLQPED 602 



RESULT 13 
OPUE_BACSU 

ID OPUE_BACSU STANDARD; PRT; 492 AA. 

AC O06493;

DT 30-MAY-2000 (Rel. 39, Created) 

DT 30-MAY-2000 (Rel. 39, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Osmoregulated proline transporter (Sodium/proline symporter).

GN OPUE OR BSU06660. 

OS Bacillus subtilis. 

OC Bacteria; Firmicutes; Bacillales; Bacillaceae; Bacillus. 

OX NCBI_TaxID=1423; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=168 / JH642; 

RA von Blohn C., Kempf B., Kappes R.M., Bremer E.;

RT "Osmostress response in Bacillus subtilis: characterization of a

RT proline uptake system (OpuE) regulated by high osmolarity and the

RT alternative transcription factor sigma B.";

RL Mol. Microbiol. 25:175-187(1997). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=168;

RX MEDLINE=97124186; PubMed=8969499;

RA Borriss R., Porwollik S., Schroeter R.;

RT "The 52 degrees-55 degrees segment of the Bacillus subtilis 

RT chromosome: a region devoted to purine uptake and metabolism, and 

RT containing the genes cotA, gabP and guaA and the pur gene cluster 

RT within a 34960 bp nucleotide sequence."; 

RL Microbiology 142:3027-3031(1996). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=168; 

RX MEDLINE=98044033; PubMed=9384377;

RA Kunst F., Ogasawara N., Moszer I., Albertini A.M., Alloni G.,

RA Azevedo V., Bertero M.G., Bessieres P., Bolotin A., Borchert S.,

RA Borriss R., Boursier L., Brans A., Braun M., Brignell S.C., Bron S

RA Brouillet S., Bruschi C.V., Caldwell B., Capuano V., Carter N.M.,

RA Choi S.K., Codani J.J., Connerton I.F., Cummings N.J., Daniel R.A.

RA Denizot F., Devine K.M., Dusterhoft A., Ehrlich S.D., Emmerson P.T

RA Entian K.D., Errington J., Fabret C., Ferrari E., Foulger D.,

RA Fritz C., Fujita M., Fujita Y., Fuma S., Galizzi A., Galleron N.,

RA Ghim S.Y., Glaser P., Goffeau A., Golightly E.J., Grandi G.,

RA Guiseppi G., Guy B.J., Haga K., Haiech J., Harwood C.R., Henaut A.

RA Hilbert H., Holsappel S., Hosono S., Hullo M.F., Itaya M., Jones L

RA Joris B., Karamata D., Kasahara Y., Klaerr-Blanchard M., Klein C.,

RA Kobayashi Y., Koetter P., Koningstein G., Krogh S., Kumano M.,

RA Kurita K., Lapidus A., Lardinois S., Lauber J., Lazarevic V.,

RA Lee S.M., Levine A., Liu H., Masuda S., Mauel C., Medigue C.,

RA Medina N., Mellado R.P., Mizuno M., Moestl D., Nakai S., Noback M.

RA Noone D., O'Reilly M., Ogawa K., Ogiwara A., Oudega B., Park S.H.,

RA Parro V., Pohl T.M., Portetelle D., Porwollik S., Prescott A.M.,

RA Presecan E., Pujic P., Purnelle B., Rapoport G., Rey M., Reynolds S.,

RA Rieger M., Rivolta C., Rocha E., Roche B., Rose M., Sadaie Y.,

RA Sato T., Scanlan E., Schleich S., Schroeter R., Scoffone F.,

RA Sekiguchi J., Sekowska A., Seror S.J., Serror P., Shin B.S., Soldo B.,

RA Sorokin A., Tacconi E., Takagi T., Takahashi H., Takemaru K.,

RA Takeuchi M., Tamakoshi A., Tanaka T., Terpstra P., Tognoni A.,

RA Tosato V., Uchiyama S., Vandenbol M., Vannier F., Vassarotti A.,

RA Viari A., Wambutt R., Wedler E., Wedler H., Weitzenegger T.,

RA Winters P., Wipat A., Yamamoto H., Yamane K., Yasumoto K., Yata K.,

RA Yoshida K., Yoshikawa H.F., Zumstein E., Yoshikawa H., Danchin A.;

RT "The complete genome sequence of the Gram-positive bacterium Bacillus 

RT subtilis."; 

RL Nature 390:249-256(1997). 

CC -!- FUNCTION: CATALYZES THE SODIUM-DEPENDENT UPTAKE OF EXTRACELLULAR
CC PROLINE.

CC -!- SUBCELLULAR LOCATION: Integral membrane protein.

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).

CC --------------------------------------------------------------------------

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; U92466; AAB66512.1; -. 

DR EMBL; AF011545; AAB72182.1; -. 

DR EMBL; Z99107; CAB12486.1; -. 

DR PIR; H69670; H69670. 

DR SubtiList; BG12641; opuE. 

DR InterPro; IPR001734; Na/solut_symport.

DR Pfam; PF00474; SSF; 1.

DR TIGRFAMs; TIGR00813; sss; 1.

DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.

DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.

KW Transport; Transmembrane; Sodium transport; Symport;
KW Complete proteome.
FT TRANSMEM 3 23 POTENTIAL.
FT TRANSMEM 62 82 POTENTIAL.
FT TRANSMEM 125 145 POTENTIAL.
FT TRANSMEM 161 181 POTENTIAL.
FT TRANSMEM 191 211 POTENTIAL.
FT TRANSMEM 224 244 POTENTIAL.
FT TRANSMEM 271 291 POTENTIAL.
FT TRANSMEM 314 334 POTENTIAL.
FT TRANSMEM 365 385 POTENTIAL.
FT TRANSMEM 394 414 POTENTIAL.
FT TRANSMEM 424 444 POTENTIAL.
FT TRANSMEM 449 469 POTENTIAL.
SQ SEQUENCE 492 AA; 53282 MW; 23459873

Query Match 9.6%; Score 285; DB 1; Length 492;

Best Local Similarity 22.1%; Pred. No. 8.5e-13;

Matches 118; Conservative 97; Mismatches 214; Indels 106; Gaps 18;



Qy 5 VEGLIAIIVFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVG 64 

: I : I ::::::: I I : I : I ; I : : : I I I : I I : . I : 

Db 3 IEIIISLGIYFIAMLLIGWYAFKKTTDIND YMLGGRGLGPFVTALSAGAADMS 55

Qy 65 GGYINGTAEAVYVPGYGLAWAQAPI GYSLSLILGGLFFAKPMRSKGYVTMLDPFQQI 121 

| : I | : : I I : II hi I : : I : I I : 

Db 56 GWMLMGVPGAMFATGLSTLWLALGLTIGAYSNYLLLAPRLRAYTEAADDAITIPDFFDKR 115

Qy 122 YGKRMGGLLFI PALMGEMFWAAAI FSAL GATI SVI I DVDMHI SVI I SALIATLYTLV 178 

: | : I I : : I : I : I I : : : : : I I I I 

Db 116 FQHSSSLLKIVSALIIMIFFTLYTSSGMVSGGRLFESAFGADYKLGLFLTTAVWLYTLF 175 

Qy 179 GGLYSVAYTDWQLFCIFVGLWISVPFALSHPAVADIGFTAVHAKYQKPWLGTVDSSEVY 238 

I I : I : I I I I : I I : I I : I I I I : I : : 

Db 176 GG FLAVS LTD FVQ GAI M FAAL - VLVP I VAFT-- HVGGVAPTFHEIDAVNPH 223

Qy 239 SWLDSF LLLMLGGI PWQAYFQRVLS S S SAT YAQ VLSFLA 277 

M | : : : : : I : I I : : I : I 

Db 224 -LLDIFKGASVISIISYLAWGLGY YGQPHI I VRFMAI KD I KDLKPARRI G 272 

Qy 278 AFGCLVMAIPAILIGAIGASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGL 337

::.::: I I I I II :: : | | I : I I- I I 

Db 273 MSWMIITVLGSVLTGLIG VAYAHKFGVAVKDPEMIFIIFSKILFHPLITGFLL 325 

Qy 338 GAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVWVMRITVFVFGASATAMAL 397 

| : | | : | | | | : I : I : : I : I I : I I I I I : I : I : : I I I : : I 

Db 326 SAILAAIMSSISSQLLVTASAVTEDLYRSFFRRKASDKELVMIGRLSVLVIAVIAVLLSL 385

Qy 398 LTKTVYGLWYLSSDLVYIVIF PQLLCVLFVKGTNTYGAVAGYVSG LF 444 

: | : : : | : I : I I : I I : I I : I : I = 

Db 386 NP N S T I L D LVG YAWAG FG S AFG PAI LL S L YWKRMN EWGALAAMI VGAAT VL 436

Qy 445 LRITGGEPYLYLQPLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAK 499

: | | | I : I : I : | : I : I : I 

Db 437 IWITTG LAKSTGVY-EIIP GFI LSMI AGI I VSMITK 471 



RESULT 14 
SL53_CANFA 

ID SL53_CANFA STANDARD; PRT; 718 AA. 

AC P31637; 

DT 01-JUL-1993 (Rel. 26, Created) 

DT 01-JUL-1993 (Rel. 26, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Sodium/myo-inositol cotransporter (Na(+)/myo-inositol cotransporter).

GN SLC5A3 OR SMIT.

OS Canis familiaris (Dog).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Carnivora; Fissipedia; Canidae; Canis. 
OX NCBI_TaxID=9615; 
RN [1] 

RP SEQUENCE FROM N.A. 
RC TISSUE=Kidney; 

RX MEDLINE=92210609; PubMed=1372904;

RA Kwon H.M., Yamauchi A., Uchida S., Preston A.S., Garcia-Perez A., 
RA Burg M.B., Handler J.S.; 



RT "Cloning of the cDNA for a Na+/myo-inositol cotransporter, a
RT hypertonicity stress protein.";
RL J. Biol. Chem. 267:6297-6301(1992).
CC -!- FUNCTION: Prevents intracellular accumulation of high
CC concentrations of myo-inositol (an osmolyte) that result in
CC impairment of cellular function.
CC -!- SUBCELLULAR LOCATION: Integral membrane protein.
CC -!- TISSUE SPECIFICITY: Brain and kidney.
CC -!- INDUCTION: Medium hypertonicity.
CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF).
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; M85068; NOT_ANNOTATED_CDS.
DR PIR; A42163; A42163.
DR InterPro; IPR001734; Na/solut_symport.
DR Pfam; PF00474; SSF; 1.
DR TIGRFAMs; TIGR00813; sss; 1.
DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1.
DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1.
DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1.
KW Transport; Transmembrane; Sodium transport; Symport; Glycoprotein.
FT DOMAIN 1 9 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 10 29 POTENTIAL.
FT DOMAIN 30 38 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 39 57 POTENTIAL.
FT DOMAIN 58 86 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 87 110 POTENTIAL.
FT DOMAIN 111 123 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 124 144 POTENTIAL.
FT DOMAIN 145 157 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 158 183 POTENTIAL.
FT DOMAIN 184 186 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 187 205 POTENTIAL.
FT DOMAIN 206 303 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 304 324 POTENTIAL.
FT DOMAIN 325 353 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 354 376 POTENTIAL.
FT DOMAIN 377 406 CYTOPLASMIC (POTENTIAL).


FT 285 285 IMPLICATED IN SODIUM COUPLING
FT (BY SIMILARITY).



SQ SEQUENCE 718 AA; 79545 MW; 4C1B5CC4485CD268 CRC64; 



Query Match 9.4%; Score 278.5; DB 1; Length 718; 

Best Local Similarity 23.1%; Pred. No. 3.7e-12;

Matches 154; Conservative 115; Mismatches 222; Indels 177; Gaps 34; 

Qy 9 IAII-VFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGG- 66 

III: :| :| |:: I : I : I :| III I 

Db 10 IAIVALYFILVMCIGFFAMWKSNRSTVS GYFLAGRSM — TWVAIGA 53 

Qy 67 — YINGTAEAVYVPGYGLAWAQAPIGYS LSLILGGLFFAKPMRSKGYVTM 114 

: : : : : I I I : I I : : I : I I : I : I I I II 

Db 54 SLFVSNIGSEHFI GLAG S GAAS G FAVGAWE FNAL L LLQ LL GWVF I P I Y I RS - GVYTM 109 

Qy 115 LDPFQQIYGKRMGG LLFI PALMGEMFWAAAI FSALGATI SVI IDVDMH 162 

: I I I I : I : I : : : I : I | : : : : 

Db 110 PEYLSKRFGGHRIQVYFAALSLILYIFTKLSVDLYSGALF IQESLGWNLY 159 

Qy 163 I SVI I SALIATLYTLVGGLYSVAYTDWQLFCI FVG LWISV PFAL 207 

: I I I : : I I : I I I : I I I I : I : I I : I I : : I 

Db 160 VSVI LLI GMTALLTVTGGLVAVI YTDTLQALLMI VGALTLMI I SMMEI GGFEEVKRRYML 219 

Qy 208 SHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSWLDSFLLL -MLGGIP- 253 

: I I I I I | : | | : | : : | : I I I 

Db 220 ASPNVTSILLT YN LSNTNSCNVHPKKDALKMLRNPTDEDVPWPGFVLGQTPA 271 

Qy 254 W Q AY FQ RVL S S S SAT YAQ VLS FLAAFGCLVMAI PAI L 290 

I I | | | | : : : : | : : I I : : : I : : 

Db 272 SWYWCADQVIVQRVLAAKNIAHAKGSTLMAGFLKLLPMFIIWPGMISRILF7VDDIACI 331 

Qy 291 IGAIGASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVM 345 

: I: : II I :: I II : : : I hi 

Db 332 NPEHCMQVCGSRAGCSNIAY PRLVMKLVPVGLRGLMMAVMIAALM 376

Qy 346 SSADSSILSASSMFARNIYQLSFRQNASDKEIVWVMRITV-FVFGASATAMALLTKTVYG 404

I || I I I : : I : : I : I I : : I I : I : : I I I I I : I : : : : I 
Db 377 SDLDSIFNSASTIFTLDVYKL-IRRSASSRELMIVGRIFVAFMWISIAWVPIIVEMQGG 435

Qy 405 LWYLSSDLVYIVIFPQL LCVL FVKGTNT YGAVAGYVSGLFLRITGGEPYLYL 456

II I : I : I : I I I I I : I I : I I : I : I : I

Db 436 QMYL YI QEVADYLT P PVAALFLLAI FWKRCNEQGAFYGGMAGFVLGA- VRLT -- LAFAYR 492

Qy 457 QPLIFY PGYYPDDNGI YNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLD 512 

I | | : | : : | I I I : I : : I I III: 
Db 493 APECDQPDNRPGFI KDIHYMY VATALFWVTGLIT-VIVSLL TPPPTKE 539 

Qy 513 VFDAWARHS EENMDKTI LVKNENI KLDELALVKPRQSMTLSSTFTNKEAFLDVDS S PEG 572 

I : I ::: I I II :: : : I III : II 

Db 540 QI RTTTFWSKKSLWKESCSPKDEPYKMQEKSILRCSE NSEATNHI -- I PNG 589 

Qy 573 SGTEDNLQ 580 

:||::: 

Db 590 K-SEDSIK 596 



RESULT 15 
SL53_BOVIN 



ID SL53_BOVIN STANDARD; PRT; 718 AA. 

AC P53793; 

DT 01-OCT-1996 (Rel. 34, Created) 

DT 01-OCT-1996 (Rel. 34, Last sequence update) 

DT 15-JUL-1998 (Rel. 36, Last annotation update) 

DE Sodium/myo-inositol cotransporter (Na(+)/myo-inositol cotransporter). 

GN SLC5A3 OR SMIT. 

OS Bos taurus (Bovine). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea; 

OC Bovidae; Bovinae; Bos. 

OX NCBI_TaxID=9913; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Mallee J.J., Parrella T., Kwon H.M., Berry G.T.; 

RL Submitted (APR-1996) to the EMBL/GenBank/DDBJ databases. 

CC -!- FUNCTION: Prevents intracellular accumulation of high 

CC concentrations of myo-inositol (an osmolyte) that result in 

CC impairment of cellular function. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- SIMILARITY: BELONGS TO THE SODIUM:SOLUTE SYMPORTER FAMILY (SSF). 

CC -------------------------------------------------------------------------- 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 

DR EMBL; U41338; AAA93188.1; 

DR InterPro; IPR001734; Na/solute_symport. 

DR Pfam; PF00474; SSF; 1. 

DR TIGRFAMs; TIGR00813; sss; 1. 

DR PROSITE; PS00456; NA_SOLUT_SYMP_1; 1. 

DR PROSITE; PS00457; NA_SOLUT_SYMP_2; 1. 

DR PROSITE; PS50283; NA_SOLUT_SYMP_3; 1. 

KW Transport; Transmembrane; Sodium transport; Symport; Glycoprotein. 
FT (BY SIMILARITY). 


SQ SEQUENCE 718 AA; 79673 MW; 206BE25FA385111D CRC64; 



Query Match 9.3%; Score 275; DB 1; Length 718; 

Best Local Similarity 22.2%; Pred. No. 6.6e-12; 

Matches 148; Conservative 122; Mismatches 225; Indels 172; Gaps 32; 

Qy 9 IAII-VFYLLILLVGIWAAWRTKNSGSAEERSEAIIVGGRDIGLLVGGFTMTATWVGGG- 66 

I I I : : : : : I : : : I : I i : : I : I : I : I III I 

Db 10 IAIVALYFILVMCIGFFAMWKSNRSTVS GYFLAGRSM -- TWVAIGA 53 

Qy 67 -- YINGTAEAVYVPGYGLAWAQAPIGYS LSLILGGLFFAKPMRSKGYVTM 114 

: : : : : I I I : I I : : I : I I : I : I I I I I 

Db 54 SLFVSNIGSEHFI---GLAGSGAASGFAVGAWEFNALLLLQLLGWVFIPIYIRS-GVYTM 109 



Qy 115 LDPFQQIYGKRMGG LLFI PALMGEMFWAAAI FSALGATI SVI IDVDMH 162 

: I I I I : I : I : : : I : I I : ::: 

Db 110 PEYLSKRFGGHRIQVYFAALSLILYIFTKLSVDLYSGALF IQESMGWNLY 159 



Qy 163 I SVI I SALIATLYTLVGGLYSVAYTDWQLFCI FVG LWISV PFAL 207 

: I I I : : I I : I I I : I I I I : I : I I : I I : : I 

Db 160 VSVILLIGMTALLTVTGGLVAVIYTDTLQALLMIVGALTLMVISMMEIGGFEEVKRRYML 219 

Qy 208 SHPAVADIGFTAVHAKYQKPWLGTVDSSEVYSWLDSFLLL MLGGIP- 253 

: I I I I I I : I I : I : : I : I I I 

Db 220 ASPNVTSILLT YN LSNTNSCNVHPKKDALKMLRNPTDEDVPWPGFILGQTPA 271 



Qy 254 W QA Y FQRVL S S S SAT YAQ VLSFLAAFGCLVMAI PAIL 290 

I I | | | | : : : : I : : I I : : : I : : 

Db 272 SVWYWCADQVIVQRVLAAKNIAHAKGSTLMAGFLKLLPMFIIWPGMISRILFADDIACI 331 

Qy 291 IGAIGASTDWNQTAYGLPDPKTTEEADMILPIVLQYLCPVYISFFGLGAVSAAVM 345 

: I : : I I I :: I I I : : : I I : I 

Db 332 NPEHCMQVCGSRAGCSNIAY P RLVMKLVP VGLRGLMMAVMI AALM 376 



Qy 346 S SADS S I LS AS SMFARNI YQLS FRQNAS DKE I VWVMRITV- FVFGAS ATAMALLTKTVYG 404 

I II I I I : : I : : I : I I : : I I : I : : I I I I I : I : : : : I 
Db 377 S DLDS I FN SAST I FTLDVYKL- 1 RKSAS S RELMI VGRI FVAFMWI S I AWVP 1 1 VEMQGG 435 



Qy 405 LWYLSSDLVYIVIFPQL LCVLFVKGTNTYGAVAGYVSGLFL RITGGEPYLYLQ 457 

I I I : I : I : I I I I I I :: I I I : I : I 

Db 436 QMYLYIQEVADYLTPPVAALFLIAIFWKRCNEQGAFYGGMAGFILVVVRLT--LAFAYRA 493 

Qy 458 PLIFYPGYYPDDNGIYNQKFPFKTLAMVTSFLTNICISYLAKYLFESGTLPPKLDVFDAV 517 

I ||:::: : I : : I : I : : I III : 
Db 494 P ECDQPDNRPVFIKDIHYMYVATALFWITGL-ITVIVSLL TPPPTKEQI 541 

Qy 518 VARHSEENMDKTILV KNENIKLDELALVK-PRQSMTLSSTFTNKEAFLDVDSSP 570 



Db 542 --RTTTFWSKKSLWKESCSPKDEPYKMQEKSILRCSENSEVINHVI PNGKS EDSI 595 

Qy 571 EGSGTED 577 

: I II 
Db 596 KGLQPED 602 



Search completed: March 22, 2004, 15:32:58 
Job time : 32 secs 
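
Note on the summary percentages: the figures printed for RESULT 15 are consistent with simple ratios. This is a minimal sketch, assuming (the report itself does not state the formulas) that Query Match is Score divided by the Perfect score of 2972 given in the report header, and that Best Local Similarity is Matches divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). The function name `percent` is a hypothetical helper, not part of the search software.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

# Values copied from the RESULT 15 summary lines.
score = 275
perfect_score = 2972  # "Perfect score" from the report header
matches, conservative, mismatches, indels = 148, 122, 225, 172

# Assumption: every alignment column falls into exactly one of these four counts.
aligned_columns = matches + conservative + mismatches + indels

query_match = percent(score, perfect_score)
best_local_similarity = percent(matches, aligned_columns)

print(f"Query Match {query_match}%")
print(f"Best Local Similarity {best_local_similarity}%")
```

Under the same assumption, the counts printed for the preceding result (Matches 154; Conservative 115; Mismatches 222; Indels 177) reproduce its reported 23.1%.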



